SAS/OR® 9.22 User's Guide: Mathematical Programming. Cary, NC: SAS Institute Inc.
SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Chapter 1. What's New in SAS/OR 9.22 . . . . . . . . . . . . . . . . . . 1
Chapter 2. Using This Book . . . . . . . . . . . . . . . . . . . . . . 7
Chapter 3. Introduction to Optimization . . . . . . . . . . . . . . . . . . 13
Chapter 4. The INTPOINT Procedure . . . . . . . . . . . . . . . . . . . 43
Chapter 5. The LP Procedure . . . . . . . . . . . . . . . . . . . . . . 171
Chapter 6. The NLP Procedure . . . . . . . . . . . . . . . . . . . . . 305
Chapter 7. The NETFLOW Procedure . . . . . . . . . . . . . . . . . . 451
Chapter 8. The OPTMODEL Procedure . . . . . . . . . . . . . . . . . . 717
Chapter 9. The Interior Point NLP Solver . . . . . . . . . . . . . . . . . 867
Chapter 10. The Linear Programming Solver . . . . . . . . . . . . . . . . . 891
Chapter 11. The Mixed Integer Linear Programming Solver . . . . . . . . . . . 923
Chapter 12. The NLPC Nonlinear Optimization Solver . . . . . . . . . . . . . 975
Chapter 13. The Unconstrained Nonlinear Programming Solver . . . . . . . . . . 1015
Chapter 14. The Quadratic Programming Solver (Experimental) . . . . . . . . . 1033
Chapter 15. The Sequential Quadratic Programming Solver . . . . . . . . . . . 1055
Chapter 16. The MPS-Format SAS Data Set . . . . . . . . . . . . . . . . . 1083
Chapter 17. The OPTLP Procedure . . . . . . . . . . . . . . . . . . . . 1101
Chapter 18. The OPTMILP Procedure . . . . . . . . . . . . . . . . . . . 1149
Chapter 19. The OPTQP Procedure . . . . . . . . . . . . . . . . . . . . 1197
Subject Index 1227
Syntax Index 1243
Acknowledgments
Contents
Credits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
Support Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Credits
Documentation
Writing Ioannis Akrotirianakis, Melanie Bain, Hao Cheng,
Matthew Galati, Dmitry V. Golovashkin, Jennie Hu, Tao
Huang, Trevor Kearney, Zhifeng Li, Richard Liu, Michelle
Opp, Girish Ramachandra, Jack Rouse, Ben-Hao Wang,
Kaihong Xu, Yan Xu, Wenwen Zhou
Editing Virginia Clark, Ed Huddleston, Anne Jones, Donna Sawyer
Documentation Support Tim Arnold, Natalie Baerlocher, Melanie Bain, Remya
Chandran, Richard Liu, Jianzhe Luo, Michelle Opp, Girish
Ramachandra
Technical Review Tonya Chapman, Donna Fulenwider, Bill Gjertsen, Tao
Huang, Edward P. Hughes, John Jasperse, Charles B. Kelly,
Radhika Kulkarni, Bengt Pederson, Rob Pratt
Software
The procedures in SAS/OR software were implemented by the Operations Research and Development Department. Substantial support was given to the project by other members of the Analytical Solutions Division. The Core Development Division, Display Products Division, Graphics Division, and Host Systems Division also contributed to this product.
The following list gives, for each procedure or component, the names of the developers who currently support it.
INTPOINT Trevor Kearney
LP Ben-Hao Wang
NETFLOW Trevor Kearney
NLP Tao Huang
OPTMODEL Jack Rouse
IPNLP Solver Ioannis Akrotirianakis, Joshua Griffin
LP Simplex Solvers Ben-Hao Wang, Yan Xu
LP Iterative Interior Solver Hao Cheng
MILP Solver Philipp Christophel, Amar Narisetty, Yan Xu
NLPC Solver Tao Huang
NLPU Solver Ioannis Akrotirianakis, Joshua Griffin, Wenwen Zhou
QP Solver Hao Cheng
SQP Solver Wenwen Zhou
OPTLP Ben-Hao Wang, Hao Cheng, Yan Xu, Kaihong Xu
OPTQP Hao Cheng, Wenwen Zhou, Kaihong Xu
OPTMILP Philipp Christophel, Amar Narisetty, Yan Xu
MPS-Format SAS Data Set Hao Cheng
ODS Output Kaihong Xu
LP and MILP Pre-solve Yan Xu
QP Pre-solve Wenwen Zhou
Linear Algebra Specialist Alexander Andrianov
Support Groups
Software Testing Wei Huang, Rui Kang, Jennifer Lee, Yu-Min Lin, Sanjeewa Naranpanawe, Bengt Pederson, Rob Pratt, Jonathan Stephenson, Wei Zhang, Lois Zhu
Technical Support Tonya Chapman
Acknowledgments
Many people have been instrumental in the development of SAS/OR software. The individuals
acknowledged here have been especially helpful.
Richard Brockmeier Union Electric Company
Ken Cruthers Goodyear Tire & Rubber Company
Patricia Duffy Auburn University
Richard A. Ehrhardt University of North Carolina at Greensboro
Paul Hardy Babcock & Wilcox
Don Henderson ORI Consulting Group
Dave Jennings Lockheed Martin
Vidyadhar G. Kulkarni University of North Carolina at Chapel Hill
Wayne Maruska Basin Electric Power Cooperative
Roger Perala United Sugars Corporation
Bruce Reed Auburn University
Charles Rissmiller Lockheed Martin
David Rubin University of North Carolina at Chapel Hill
John Stone North Carolina State University
Keith R. Weiss ICI Americas Inc.
The final responsibility for the SAS System lies with SAS Institute alone. We hope that you will always let us know your opinions about the SAS System and its documentation. It is through your participation that SAS software is continuously improved.
Chapter 1
What's New in SAS/OR 9.22
Contents
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Highlights of Enhancements in SAS/OR 9.22 . . . . . . . . . . . . . . . . . . 1
Highlights of Enhancements in SAS/OR 9.2 . . . . . . . . . . . . . . . . . 2
SAS/OR Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
The GANTT Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Microsoft Project Conversion Macros . . . . . . . . . . . . . . . . . . . . . . . . 3
The CLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The OPTMODEL Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The OPTMILP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
SAS Simulation Studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Overview
SAS/OR 9.22 continues the improvements that were delivered starting with SAS/OR 9.2. Several
new and enhanced features expand the scale and scope of problems that SAS/OR software can
address. These enhancements also make it easier for you to use the SAS/OR capabilities. Brief
descriptions of these new features are presented in the following sections.
Highlights of Enhancements in SAS/OR 9.22
Highlights of the changes include the following:
• You can customize the format of the time axis on the Gantt chart.
• You can import and convert Microsoft Project data that has been saved in XML format.
• The CLP procedure is now production, with the exception of the scheduling-related constraints.
• The OPTMODEL procedure supports named problems to enable easy manipulation of multiple subproblems.
• The IPNLP and NLPU solvers support new techniques for large-scale optimization.
• SAS Simulation Studio 1.5 is a new graphical application for discrete event simulation and is included with SAS/OR software. Documentation is available at the following link:
http://support.sas.com/documentation/onlinedoc/simstudio/index.html
More information about the changes and enhancements is provided in this chapter. Details can be found in the relevant volumes of the SAS/OR 9.22 User's Guide and in the SAS Simulation Studio 1.5: User's Guide.
Highlights of Enhancements in SAS/OR 9.2
Some users are moving directly from SAS/OR 9.1.3 to SAS/OR 9.22. The following are some of the
major enhancements that were introduced in SAS/OR 9.2:
• The MPSOUT= option directs procedures to save input problem data in an MPS-format SAS data set. The MPSOUT= option is available in the LP, NETFLOW, and OPTLP procedures.
• The IIS= option for the LP solver enables you to identify, for an infeasible linear program, constraints and variable bounds that form an irreducible infeasible set (IIS). The IIS= option is available in the OPTLP and OPTMODEL procedures.
• The value 2 for the PRINTLEVEL= option directs procedures to produce an ODS table called ProblemStatistics in addition to the ProblemSummary and SolutionSummary ODS tables that are produced for PRINTLEVEL=1. The PRINTLEVEL=2 option is available in the INTPOINT, OPTLP, and OPTMILP procedures.
• The %SASTOMSP macro converts data sets that are used by the CPM and PM procedures into an MDB file that is readable by Microsoft Project.
• Several call routines in the GA procedure were replaced by new call routines.
• The CLP procedure features improved algorithms for the all-different constraint in addition to several extensions to the edge-finder algorithm for resource-constrained scheduling.
For more information, see support.sas.com/whatsnewor92.
SAS/OR Documentation
SAS/OR software is documented in the following volumes:
• SAS/OR User's Guide: Bills of Material Processing
• SAS/OR User's Guide: Constraint Programming
• SAS/OR User's Guide: Local Search Optimization
• SAS/OR User's Guide: Mathematical Programming
• SAS/OR User's Guide: Project Management
• SAS/OR User's Guide: QSIM Application
• SAS Simulation Studio 1.5: User's Guide
Online help can also be found under the corresponding classification.
The GANTT Procedure
The GANTT procedure produces a Gantt chart, which is a graphical tool for representing schedule-related information. PROC GANTT provides support for displaying multiple schedules, precedence relationships, calendar information, milestones, reference lines, labeling, and so on. New in SAS/OR 9.22 is the TIMEAXISFORMAT= option in the CHART statement, which provides the capability to customize the format of the time axis on the Gantt chart for up to three rows. Each row can be formatted using a predefined SAS format or a user-defined format.
Microsoft Project Conversion Macros
The SAS macro %MSPTOSAS converts Microsoft Project 98 (and later) data into SAS data sets
that can be used as input for project scheduling with SAS/OR software. This macro generates the
necessary SAS data sets, determines the values of the relevant options, and invokes the SAS/OR PM
procedure with the converted project data. The %MSPTOSAS macro enables you to use Microsoft
Project for the input of project data and still take advantage of the excellent SAS/OR project and
resource scheduling capabilities. New in SAS/OR 9.22 is the capability to import and convert
Microsoft Project data that has been saved in XML format. This feature is experimental.
The experimental %SASTOMSP macro converts data sets that are used by the CPM and PM procedures into a Microsoft Access Database (MDB) file that is readable by Microsoft Project. The macro converts information that is common to PROC CPM, PROC PM, and Microsoft Project; this information includes hierarchical relationships, precedence relationships, time constraints, resource availabilities, resource requirements, project calendars, resource calendars, task calendars, holiday information, and work-shift information. In addition, the early and late schedules, the actual start and finish times, the resource-constrained schedule, and the baseline schedule are also extracted and stored as start-finish variables.
Execution of the %MSPTOSAS and %SASTOMSP macros requires SAS/ACCESS
software.
The CLP Procedure
The CLP procedure is a finite-domain constraint programming solver for solving constraint satisfaction problems (CSPs) with linear, logical, global, and scheduling constraints. The CLP procedure is production in SAS/OR 9.22 with the exception of the scheduling-related constraints.
New in SAS/OR 9.22 are the GCC and ELEMENT statements for defining global cardinality constraints (GCC) and element constraints, respectively. The GCC statement enables you to bound the number of times that a specific value gets assigned to a set of variables. The ELEMENT statement enables you to define dependencies, not necessarily functional, between variables and to define noncontiguous domains.
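As an illustrative sketch (not an example from this guide), a global cardinality constraint might bound how often each value is assigned across a set of variables. The variable names, domains, and bounds below are hypothetical, and the exact statement syntax should be verified against the CLP procedure documentation:

```sas
/* Hypothetical sketch: six variables, each with domain [1,4].        */
/* The GCC statement bounds value counts: value 1 must be assigned    */
/* 1 to 2 times, and value 2 must be assigned 1 to 3 times.           */
proc clp out=solutions;
   var (x1-x6) = [1, 4];
   gcc (x1-x6) = ( (1, 1, 2) (2, 1, 3) );
run;
```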
The USECONDATAVARS= option enables you to implicitly define numeric variables in the CONDATA= data set. The TIMETYPE= option enables you to set the units (real time or CPU time) of the MAXTIME= parameter. The _ORCLP_ macro variable has been enhanced to provide more information about procedure status and solution status.
There are also several changes and enhancements to the scheduling capabilities in SAS/OR 9.22. Support for multiple-capacity resources has been added in the RESOURCE statement and the Activity data set. The REQUIRES statement syntax for specifying multiple resource requirements has changed. The format of the Activity data set has changed to a more compact form with a fixed number of variables. A new Resource data set, specified with the RESDATA= option, enables you to define resources, resource pools, and resource attributes in compact form. The format of the Schedule data set has been enhanced to separate time- and schedule-related observations. Two new schedule-related output data sets, SCHEDTIME= and SCHEDRES=, have been added; they contain time assignment and resource assignment information, respectively.
The OPTMODEL Procedure
The OPTMODEL procedure provides a modeling environment that is tailored to building, solving,
and maintaining optimization models. This makes the process of translating the symbolic formulation
of an optimization model into PROC OPTMODEL virtually transparent, because the modeling
language mimics the symbolic algebra of the formulation as closely as possible. PROC OPTMODEL
also streamlines and simplies the critical process of populating optimization models with data
from SAS data sets. All of this transparency produces models that are more easily inspected for
completeness and correctness, more easily corrected, and more easily modied, whether through
structural changes or through the substitution of new data for old data.
The OPTMODEL procedure consists of the powerful OPTMODEL modeling language and access to
state-of-the-art solvers for several classes of mathematical programming problems.
Seven solvers are available to OPTMODEL, as listed in Table 1.1.
Table 1.1 List of OPTMODEL Solvers
Problem Solver
Linear programming LP
Mixed integer programming MILP
Quadratic programming (Experimental) QP
Nonlinear programming, unconstrained NLPU
General nonlinear programming NLPC
General nonlinear programming SQP
General nonlinear programming IPNLP
In SAS/OR 9.22, the OPTMODEL procedure adds several new features. First, PROC OPTMODEL
supports named problems to enable easy manipulation of multiple subproblems. The PROBLEM
declaration declares a named problem and the USE PROBLEM statement makes it active. Objectives
can now be declared as arrays, so they can provide separate objectives for arrays of named problems.
Implicit variables, created via the IMPVAR declaration, allow optimization expressions to be referred to by name in a model. Implicit variables can be evaluated more efficiently than by repeating the same complex expression in multiple places.
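The named-problem pattern can be sketched as follows. This is a hypothetical toy model, not an example from this guide, and the component list in the PROBLEM declaration is illustrative:

```sas
proc optmodel;
   var x >= 0, y >= 0;
   max profit = 3*x + 5*y;        /* objective  */
   con c1: x + 2*y <= 10;         /* constraint */

   /* Declare a named problem from these components, make it the  */
   /* active problem, and solve it.                               */
   problem prob1 include x y profit c1;
   use problem prob1;
   solve;
   print x y;
quit;
```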
Problem components can be accessed with aliases such as _VAR_ and _CON_, which respectively aggregate all of the variables and constraints in a problem. This allows convenient processing of all of the problem components of a given kind for printing, model expansion, and other purposes. The new suffixes .NAME and .LABEL can be used to track the identity of problem components.
Function and subroutine calls can use the OF array-name[*] syntax to pass an OPTMODEL array
to a called routine for uses such as sorting.
The NUMBER, STRING, and SET declarations allow initial values for arrays to be supplied using
an INIT clause with a list of initialization values.
The SOLVE statement supports the RELAXINT keyword to solve a problem while temporarily
relaxing the integrality restriction on variables.
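For example, a solve with relaxed integrality might be sketched as follows (a hypothetical model, using the RELAXINT keyword described above):

```sas
proc optmodel;
   var x {1..3} >= 0 integer;
   min cost = sum {i in 1..3} (i + 1) * x[i];
   con demand: sum {i in 1..3} x[i] >= 5;

   solve with milp;             /* integer-constrained solve            */
   solve with milp relaxint;    /* same problem, integrality relaxed    */
quit;
```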
Analytic derivatives are now generated for most SAS library functions. The OPTMODEL procedure can use threading on systems with multiple processors to speed up the evaluation of Hessians for nonlinear models.
Starting with SAS/OR 9.22, the IPNLP and NLPU solvers support new techniques for large-scale optimization. The nonlinear solver IPNLP has been equipped with two new techniques. The first technique, TECH=IPKRYLOV, is appropriate for large-scale nonlinear optimization problems that can contain many thousands of variables or constraints or both. It uses exact second derivatives to calculate the search directions. Its convergence is achieved by using a trust-region framework that guides the algorithm toward the solution of the optimization problem. The second technique, TECH=IPQN, uses a quasi-Newton method and line-search framework to solve the optimization problem. As such, it needs to calculate only the first derivatives of the objective and constraints. This method is more appropriate for problems where the second derivatives of the objective and constraints either are not available or are expensive to compute.
The unconstrained solver NLPU has been equipped with a new technique called TECH=CGTR. This technique uses the conjugate gradient method to solve large-scale unconstrained and bound-constrained optimization problems.
The OPTMILP Procedure
The OPTMILP procedure solves mixed-integer linear programming problems with a linear-programming-based branch-and-bound algorithm that has been improved for SAS/OR 9.22. The
algorithmic improvements result from incorporating new techniques in the presolver and cutting
planes, better application of primal heuristics, an improved branch-and-bound strategy, and an
improved strategy for handling feasibility problems. Improvements to the presolver include variable
and constraint reductions based on logical implications among binary variables and generalized
variable substitutions. Two new cutting plane routines (mixed 0-1 lifted inequalities and zero-half
cuts) have been added, and improvements have been made to clique, Gomory mixed integer, and
mixed integer rounding (MIR) cutting plane routines.
The resulting improvements in efficiency enable you to use PROC OPTMILP to solve larger and more complex optimization problems in a shorter time than with previous SAS/OR releases.
SAS Simulation Studio
SAS Simulation Studio is a discrete event simulation application for modeling the operation of call centers, supply chains, emergency rooms, and other real-world systems in which there are significant random elements (timing and length of events, requirements, and so on). Its graphical user interface
provides a full set of tools and components for building, executing, and analyzing the data that are
generated by discrete event simulation models. SAS Simulation Studio provides extensive modeling
and analysis tools suitable for both novice and advanced simulation users.
SAS Simulation Studio integrates fully with JMP software to provide experimental design capabilities
for evaluating and analyzing your simulation models. Any of the JMP and SAS statistical analysis
tools can be used, either to analyze results after the simulation model is run or to perform embedded
analyses that occur while the simulation model is running.
SAS Simulation Studio 1.5 has been included with SAS/OR software since its release in August
2009.
Chapter 2
Using This Book
Contents
Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Typographical Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Conventions for Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Accessing the SAS/OR Sample Library . . . . . . . . . . . . . . . . . . . . 10
Online Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Additional Documentation for SAS/OR Software . . . . . . . . . . . . . . . . . . 10
Purpose
SAS/OR User's Guide: Mathematical Programming provides a complete reference for the mathematical programming procedures in SAS/OR software. This book serves as the primary documentation for the INTPOINT, LP, NETFLOW, and NLP procedures, in addition to the new OPTLP, OPTMILP, OPTMODEL, and OPTQP procedures, the various solvers used by PROC OPTMODEL, and the MPS-format SAS data set specification.
"Using This Book" describes the organization of this book and the conventions used in the text and example code. To gain full benefit from using this book, you should familiarize yourself with the information presented in this section and refer to it when needed. The section "Additional Documentation for SAS/OR Software" on page 10 refers to other documents that contain related information.
Organization
Chapter 3 contains a brief overview of the mathematical programming procedures in SAS/OR software and provides an introduction to optimization and the use of the optimization tools in the SAS System. That chapter also describes the flow of data between the procedures and how the components of the SAS System fit together.
After the introductory chapter, the next five chapters describe the INTPOINT, LP, NETFLOW, NLP, and OPTMODEL procedures. The next seven chapters describe the interior point nonlinear programming, linear programming, mixed integer linear programming, nonlinear optimization, unconstrained nonlinear programming, quadratic programming, and sequential quadratic programming solvers, which are used by the OPTMODEL procedure. The next chapter is the specification of the newly introduced MPS-format SAS data set. The last three chapters describe the new OPTLP, OPTMILP, and OPTQP procedures for solving linear programming, mixed integer linear programming, and quadratic programming problems, respectively. Each procedure description is self-contained; you need to be familiar with only the basic features of the SAS System and SAS terminology to use most procedures. The statements and syntax necessary to run each procedure are presented in a uniform format throughout this book.
The following list summarizes the types of information provided for each procedure:
Overview provides a general description of what the procedure does.
It outlines major capabilities of the procedure and lists all
input and output data sets that are used with it.
Getting Started illustrates simple uses of the procedure using a few short
examples. It provides introductory hands-on information
for the procedure.
Syntax constitutes the major reference section for the syntax of the
procedure. First, the statement syntax is summarized. Next,
a functional summary table lists all the statements and
options in the procedure, classied by function. In addition,
the online version includes a Dictionary of Options, which
provides an alphabetical list of all options. Following these
tables, the PROC statement is described, and then all other
statements are described in alphabetical order.
Details describes the features of the procedure, including algorithmic details and computational methods. It also explains how the various options interact with each other. This section describes input and output data sets in greater detail, with definitions of the output variables, and explains the format of printed output, if any.
Examples consists of examples that are designed to illustrate the use
of the procedure. Each example includes a description of
the problem and lists the options that are highlighted by
the example. The example shows the data and the SAS
statements needed, and includes the output produced. You
can duplicate the examples by copying the statements and
data and running the SAS program. The SAS Sample
Library contains the code used to run the examples shown
in this book; consult your SAS Software representative for
specic information about the Sample Library.
References lists references that are relevant to the chapter.
Typographical Conventions
The printed version of SAS/OR User's Guide: Mathematical Programming uses various type styles, as explained by the following list:
roman is the standard type style used for most text.
UPPERCASE ROMAN is used for SAS statements, options, and other SAS language elements when they appear in the text. However, you can enter these elements in your own SAS code in lowercase, uppercase, or a mixture of the two. This style is also used for identifying arguments and values (in the syntax specifications) that are literals (for example, to denote valid keywords for a specific option).
UPPERCASE BOLD is used in the Syntax section to identify SAS keywords,
such as the names of procedures, statements, and options.
VariableName is used for the names of SAS variables and data sets when
they appear in the text.
oblique is used to indicate an option variable for which you must
supply a value (for example, DUPLICATE= dup indicates
that you must supply a value for dup).
italic is used for terms that are dened in the text, for emphasis,
and for publication titles.
monospace is used to show examples of SAS statements. In most
cases, this book uses lowercase type for SAS code. You
can enter your own SAS code in lowercase, uppercase, or
a mixture of the two.
Conventions for Examples
Most of the output shown in this book is produced with the following SAS System options:
options linesize=80 pagesize=60 nonumber nodate;
Accessing the SAS/OR Sample Library
The SAS/OR sample library includes many examples that illustrate the use of SAS/OR software,
including the examples used in this documentation. To access these sample programs from the SAS
windowing environment, select Help from the main menu and then select Getting Started with
SAS Software. On the Contents tab, expand the Learning to Use SAS, Sample SAS Programs,
and SAS/OR items. Then click Samples.
Online Documentation
This documentation is available online with the SAS System. To access SAS/OR documentation
from the SAS windowing environment, select Help from the main menu and then select SAS Help
and Documentation. (Alternatively, you can type help OR in the command line.) On the Contents
tab, expand the SAS Products and SAS/OR items. Then expand the book you want to view. You
can search the documentation by using the Search tab.
You can also access the documentation by going to http://support.sas.com/documentation.
Additional Documentation for SAS/OR Software
In addition to SAS/OR User's Guide: Mathematical Programming, you may find these other documents helpful when using SAS/OR software:
SAS/OR User's Guide: Bills of Material Processing
provides documentation for the BOM procedure and all bill of material postprocessing SAS
macros. The BOM procedure and SAS macros provide the ability to generate different reports
and to perform several transactions to maintain and update bills of material.
SAS/OR User's Guide: Constraint Programming
provides documentation for the constraint programming procedure in SAS/OR software. This
book serves as the primary documentation for the CLP procedure.
SAS/OR User's Guide: Local Search Optimization
provides documentation for the local search optimization procedure in SAS/OR software.
This book serves as the primary documentation for the GA procedure, which uses genetic
algorithms to solve optimization problems.
SAS/OR User's Guide: Project Management
provides documentation for the project management procedures in SAS/OR software. This
book serves as the primary documentation for the CPM, DTREE, GANTT, NETDRAW, and
PM procedures, as well as the PROJMAN Application, a graphical user interface for project
management.
SAS/OR User's Guide: The QSIM Application
provides documentation for the QSIM application, which is used to build and analyze models
of queueing systems using discrete event simulation. This book shows you how to build
models using the simple point-and-click graphical user interface, how to run the models, and
how to collect and analyze the sample data to give you insight into the behavior of the system.
SAS/OR Software: Project Management Examples, Version 6
contains a series of examples that illustrate how to use SAS/OR software to manage projects.
Each chapter contains a complete project management scenario and describes how to use
PROC GANTT, PROC CPM, and PROC NETDRAW, in addition to other reporting and
graphing procedures in the SAS System, to perform the necessary project management tasks.
SAS Simulation Studio 1.5: User's Guide
provides documentation on using SAS Simulation Studio, a graphical application for creating
and working with discrete-event simulation models. This book describes in detail how to build
and run simulation models and how to interact with SAS software for analysis and with JMP
software for experimental design and analysis.
Chapter 3
Introduction to Optimization
Contents
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Linear Programming Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
PROC OPTLP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
PROC OPTMODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PROC LP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PROC INTPOINT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Network Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
PROC NETFLOW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
PROC INTPOINT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Mixed Integer Linear Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
PROC OPTMILP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
PROC OPTMODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
PROC LP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Quadratic Programming Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 19
PROC OPTQP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
PROC OPTMODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Nonlinear Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
PROC OPTMODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
PROC NLP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Model Building . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
PROC OPTLP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
PROC NETFLOW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
PROC OPTMODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Matrix Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Exploiting Model Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Report Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
The DATA Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Other Reporting Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 39
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
14 ! Chapter 3: Introduction to Optimization
Overview
Operations Research tools are directed toward the solution of resource management and planning
problems. Models in Operations Research are representations of the structure of a physical object or
a conceptual or business process. Using the tools of Operations Research involves the following:
- defining a structural model of the system under investigation
- collecting the data for the model
- analyzing the model
SAS/OR software is a set of procedures for exploring models of distribution networks, production
systems, resource allocation problems, and scheduling problems using the tools of Operations
Research.
The following list suggests some of the application areas where optimization-based decision support
systems have been used. In practice, models often contain elements of several applications listed
here.
- Product-mix problems find the mix of products that generates the largest return when there are several products competing for limited resources.
- Blending problems find the mix of ingredients to be used in a product so that it meets minimum standards at minimum cost.
- Time-staged problems are models whose structure repeats as a function of time. Production and inventory models are classic examples of time-staged problems. In each period, production plus inventory minus current demand equals inventory carried to the next period.
- Scheduling problems assign people to times, places, or tasks so as to optimize people's preferences or performance while satisfying the demands of the schedule.
- Multiple objective problems have multiple, possibly conflicting, objectives. Typically, the objectives are prioritized and the problems are solved sequentially in priority order.
- Capital budgeting and project selection problems ask for the project or set of projects that will yield the greatest return.
- Location problems seek the set of locations that meets the distribution needs at minimum cost.
- Cutting stock problems find the partition of raw material that minimizes waste and fulfills demand.
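The inventory-balance relation in the time-staged bullet above can be sketched in a few lines of code. This is an illustration only; the starting inventory and the production and demand figures are made up.

```python
def carry_inventory(start_inv, production, demand):
    """Apply the time-staged balance: inventory carried to the next
    period = previous inventory + production - current demand."""
    inv = start_inv
    history = []
    for produced, demanded in zip(production, demand):
        inv = inv + produced - demanded
        history.append(inv)
    return history

# three periods of hypothetical data
print(carry_inventory(10, [100, 120, 90], [95, 110, 105]))  # [15, 25, 10]
```

In a time-staged optimization model, each of these balance equations would appear as an equality constraint linking consecutive periods.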
A problem is formalized with the construction of a model to represent it. These models, called
mathematical programs, are represented in SAS data sets and then solved using SAS/OR procedures.
The solution of mathematical programs is called mathematical programming. Since mathematical
programs are represented in SAS data sets, they can be saved, easily changed, and re-solved. The
SAS/OR procedures also output SAS data sets containing the solutions. These can then be used to
produce customized reports. In addition, this structure enables you to build decision support systems
using the tools of Operations Research and other tools in the SAS System as building blocks.
The basic optimization problem is that of minimizing or maximizing an objective function subject to
constraints imposed on the variables of that function. The objective function and constraints can be
linear or nonlinear; the constraints can be bound constraints, equality or inequality constraints, or
integer constraints. Traditionally, optimization problems are divided into linear programming (LP;
all functions and constraints are linear) and nonlinear programming (NLP).
The data describing the model are supplied to an optimizer (such as one of the procedures described in this book), and an optimizing algorithm determines the optimal values for the decision variables so that the objective is maximized or minimized, the optimal values lie on or between allowable bounds, and the constraints are satisfied. The process of determining the optimal values is called optimization.
This chapter describes how to use SAS/OR software to solve a wide variety of optimization problems.
We describe various types of optimization problems, indicate which SAS/OR procedure you can use,
and show how you provide data, run the procedure, and obtain optimal solutions.
In the next section we broadly classify the SAS/OR procedures based on the types of mathematical
programming problems they can solve.
Linear Programming Problems
PROC OPTLP
PROC OPTLP solves linear programming problems that are submitted either in an MPS-format file or in an MPS-format SAS data set.
The MPS file format is a format commonly used for describing linear programming (LP) and integer programming (IP) problems (Murtagh 1981; IBM 1988). MPS-format files are in text format and have specific conventions for the order in which the different pieces of the mathematical model are specified. The MPS-format SAS data set corresponds closely to the MPS file format and is used to describe linear programming problems for PROC OPTLP. For more details, refer to Chapter 16, The MPS-Format SAS Data Set.
PROC OPTLP provides three solvers to solve the LP: primal simplex, dual simplex, and interior point. The simplex solvers implement a two-phase simplex method, and the interior point solver implements a primal-dual predictor-corrector algorithm. For more details, refer to Chapter 17, The OPTLP Procedure.
PROC OPTMODEL
PROC OPTMODEL provides a language for concisely modeling linear programming problems.
The language allows a model to be expressed in a form that matches the mathematical formulation.
Within OPTMODEL you can declare a model, pass it directly to various solvers, and review the
solver result. You can also save an instance of a linear model in data set form for use by PROC
OPTLP. For more details, refer to Chapter 8, The OPTMODEL Procedure.
PROC LP
The LP procedure solves linear and mixed integer programs with a primal simplex solver. It can
perform several types of post-optimality analysis, including range analysis, sensitivity analysis, and
parametric programming. The procedure can also be used interactively.
PROC LP requires a problem data set that contains the model. In addition, a primal and active data
set can be used for warm starting a problem that has been partially solved previously.
The problem data describing the model can be in one of two formats: dense or sparse. The dense format represents the model as a rectangular coefficient matrix. The sparse format, on the other hand, represents only the nonzero elements of a rectangular coefficient matrix.
For more details on the LP procedure, refer to Chapter 5, The LP Procedure.
Problem data specified in the format used by the LP procedure can be readily reformatted for use
with the newer OPTLP procedure. The MPSOUT= option in the LP procedure enables you to
convert data in the format used by the LP procedure into an MPS-format SAS data set for use with
the OPTLP procedure. For more information about the OPTLP procedure, see Chapter 17, The
OPTLP Procedure. For more information about the MPS-format SAS data set, see Chapter 16, The
MPS-Format SAS Data Set.
PROC INTPOINT
The INTPOINT procedure solves linear programming problems using the interior point algorithm.
The constraint data can be specified in either the sparse or dense input format. This is the same format that is used by PROC LP; therefore, any model-building techniques that apply to models for PROC LP also apply to PROC INTPOINT.
For more details on PROC INTPOINT refer to Chapter 4, The INTPOINT Procedure.
Problem data specified in the format used by the INTPOINT procedure can be readily reformatted
for use with the newer OPTLP procedure. The MPSOUT= option in the INTPOINT procedure
enables you to convert data in the format used by the INTPOINT procedure into an MPS-format
SAS data set for use with the OPTLP procedure. For more information about the OPTLP procedure,
see Chapter 17, The OPTLP Procedure. For more information about the MPS-format SAS data set,
see Chapter 16, The MPS-Format SAS Data Set.
Network Problems
PROC NETFLOW
The NETFLOW procedure solves network flow problems with linear side constraints using either a network simplex algorithm or an interior point algorithm. In addition, it can solve linear programming (LP) problems using the interior point algorithm.
Networks and the Network Simplex Algorithm
PROC NETFLOW's network simplex algorithm solves pure network flow problems and network flow problems with linear side constraints. The procedure accepts the network specification in formats that are particularly suited to networks. Although network problems could be solved by PROC LP, the NETFLOW procedure generally solves network flow problems more efficiently than PROC LP.
Network flow problems, such as finding the minimum cost flow in a network, require model representation in a format that is specialized for network structures. The network is represented in two data sets: a node data set that names the nodes in the network and gives supply and demand information at them, and an arc data set that defines the arcs in the network using the node names and gives arc costs and capacities. In addition, a side-constraint data set is included that gives any side constraints that apply to the flow through the network. Examples of these are found later in this chapter.
The constraint data can be specified in either the sparse or dense input format. This is the same format that is used by PROC LP; therefore, any model-building techniques that apply to models for PROC LP also apply to network flow models having side constraints.
Problem data specified in the format used by the NETFLOW procedure can be readily reformatted
for use with the newer OPTLP procedure. The MPSOUT= option in the NETFLOW procedure
enables you to convert data in the format used by the NETFLOW procedure into an MPS-format
SAS data set for use with the OPTLP procedure. For more information about the OPTLP procedure,
see Chapter 17, The OPTLP Procedure. For more information about the MPS-format SAS data set,
see Chapter 16, The MPS-Format SAS Data Set.
Linear and Network Programs Solved by the Interior Point Algorithm
The data required by PROC NETFLOW for a linear program resemble the data for nonarc variables
and constraints for constrained network problems. They are similar to the data required by PROC LP.
The LP representation requires a data set that defines the variables in the LP using variable names, and gives objective function coefficients and upper and lower bounds. In addition, a constraint data set can be included that specifies any constraints.
When solving a constrained network problem, you can specify the INTPOINT option to indicate
that the interior point algorithm is to be used. The input data are the same whether the simplex or
interior point method is used. The interior point method is often faster when problems have many
side constraints.
The constraint data can be specified in either the sparse or dense input format. This is the same format that is used by PROC LP; therefore, any model-building techniques that apply to models for PROC LP also apply to LP models solved by PROC NETFLOW.
Problem data specified in the format used by the NETFLOW procedure can be readily reformatted
for use with the newer OPTLP procedure. The MPSOUT= option in the NETFLOW procedure
enables you to convert data in the format used by the NETFLOW procedure into an MPS-format
SAS data set for use with the OPTLP procedure. For more information about the OPTLP procedure,
see Chapter 17, The OPTLP Procedure. For more information about the MPS-format SAS data set,
see Chapter 16, The MPS-Format SAS Data Set.
PROC INTPOINT
The INTPOINT procedure solves the Network Program with Side Constraints (NPSC) problem using
the interior point algorithm.
The data required by PROC INTPOINT are similar to the data required by PROC NETFLOW when solving network flow models using the interior point algorithm.
The constraint data can be specified in either the sparse or dense input format. This is the same format that is used by PROC LP and PROC NETFLOW; therefore, any model-building techniques that apply to models for PROC LP or PROC NETFLOW also apply to PROC INTPOINT.
For more details on PROC INTPOINT refer to Chapter 4, The INTPOINT Procedure.
Mixed Integer Linear Problems
PROC OPTMILP
The OPTMILP procedure solves general mixed integer linear programs (MILPs): linear programs in which a subset of the decision variables are constrained to be integers. The OPTMILP procedure solves MILPs with an LP-based branch-and-bound algorithm augmented by advanced techniques such as cutting planes and primal heuristics.
The OPTMILP procedure requires a MILP to be specified using a SAS data set that adheres to the MPS format. See Chapter 16, The MPS-Format SAS Data Set, for details about the MPS-format data set.
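As a rough illustration of the LP-based branch-and-bound idea (without the cutting planes or primal heuristics that PROC OPTMILP adds), the following sketch solves a tiny two-variable MILP. The LP relaxations are solved by brute-force vertex enumeration, which is adequate only at this toy scale; the bound values are illustrative choices.

```python
import itertools
import math

def lp_relax(c, A, b, lb, ub):
    """Maximize c.x over A x <= b, lb <= x <= ub (two variables)
    by enumerating vertices of the feasible polygon."""
    lines = [(A[i][0], A[i][1], b[i]) for i in range(len(A))]
    lines += [(1, 0, ub[0]), (0, 1, ub[1]), (-1, 0, -lb[0]), (0, -1, -lb[1])]
    best = None
    for (a1, b1, r1), (a2, b2, r2) in itertools.combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-9:
            continue                      # parallel lines, no vertex
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(p * x + q * y <= r + 1e-7 for p, q, r in lines):
            val = c[0] * x + c[1] * y
            if best is None or val > best[0]:
                best = (val, (x, y))
    return best                           # None when infeasible

def branch_and_bound(c, A, b, ub):
    """LP-based branch and bound for: max c.x, A x <= b, x >= 0 integer."""
    incumbent, best_val = None, -math.inf
    nodes = [((0, 0), tuple(ub))]         # (lower bounds, upper bounds)
    while nodes:
        lb, ub_n = nodes.pop()
        relax = lp_relax(c, A, b, lb, ub_n)
        if relax is None or relax[0] <= best_val + 1e-9:
            continue                      # infeasible or pruned by bound
        val, x = relax
        frac = [i for i in (0, 1) if abs(x[i] - round(x[i])) > 1e-6]
        if not frac:                      # integral: new incumbent
            incumbent, best_val = tuple(round(v) for v in x), val
            continue
        i = frac[0]                       # branch on a fractional variable
        down_ub = list(ub_n); down_ub[i] = math.floor(x[i])
        up_lb = list(lb); up_lb[i] = math.ceil(x[i])
        nodes.append((tuple(lb), tuple(down_ub)))
        nodes.append((tuple(up_lb), tuple(ub_n)))
    return best_val, incumbent

# integer-restricted version of the candy product-mix model that appears
# later in this chapter; its root relaxation happens to be integral
val, x = branch_and_bound(
    c=[0.25, 0.75],
    A=[[15, 40], [0, 56.25], [18.75, 0], [12, 50]],
    b=[27000, 27000, 27000, 27000],
    ub=(2000, 600))
```

Real MILP solvers replace the toy relaxation with simplex or interior point methods and tighten the bound at each node with cutting planes, but the prune/branch skeleton is the same.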
PROC OPTMODEL
PROC OPTMODEL provides a language for concisely modeling mixed integer linear programming
problems. The language allows a model to be expressed in a form that matches the mathematical
formulation. Within OPTMODEL you can declare a model, pass it directly to various solvers, and
review the solver result. You can also save an instance of a mixed integer linear model in data
set form for use by PROC OPTMILP. For more details, refer to Chapter 8, The OPTMODEL
Procedure.
PROC LP
The LP procedure solves MILPs with a primal simplex solver. To solve a MILP, you need to identify the integer variables. You can do this with a row in the input data set that has the keyword INTEGER for the type variable. Note that integer variables must have explicitly defined upper bounds.
As with linear programs, you can specify MIP problem data using sparse or dense format. For more
details see Chapter 5, The LP Procedure.
Quadratic Programming Problems
PROC OPTQP
The OPTQP procedure solves quadratic programs: problems with a quadratic objective function and a collection of linear constraints, including general linear constraints along with lower and/or upper bounds on the decision variables.
You can specify the problem input data in one of two formats: a QPS-format flat file or a QPS-format SAS data set. For details on the QPS-format data specification, refer to Chapter 16, The MPS-Format SAS Data Set. For more details on the OPTQP procedure, refer to Chapter 19, The OPTQP Procedure.
PROC OPTMODEL
PROC OPTMODEL provides a language for concisely modeling quadratic programming problems.
The language allows a model to be expressed in a form that matches the mathematical formulation.
Within OPTMODEL you can declare a model, pass it directly to various solvers, and review the
solver result. You can also save an instance of a quadratic model in data set form for use by PROC
OPTQP. For more details, refer to Chapter 8, The OPTMODEL Procedure.
Nonlinear Problems
PROC OPTMODEL
PROC OPTMODEL provides a language for concisely modeling nonlinear programming (NLP)
problems. The language allows a model to be expressed in a form that matches the mathematical
formulation. Within OPTMODEL you can declare a model, pass it directly to various solvers, and
review the solver result. For more details, refer to Chapter 8, The OPTMODEL Procedure.
You can solve the following types of nonlinear programming problems using PROC OPTMODEL:
- Nonlinear objective function, linear constraints: Invoke the constrained nonlinear programming (NLPC) solver. For more details about the NLPC solver, refer to Chapter 12, The NLPC Nonlinear Optimization Solver.
- Nonlinear objective function, nonlinear constraints: Invoke the sequential quadratic programming (SQP) or interior point nonlinear programming (IPNLP) solver. For more details about the SQP solver, refer to Chapter 15, The Sequential Quadratic Programming Solver. For more details about the IPNLP solver, refer to Chapter 9, The Interior Point NLP Solver.
- Nonlinear objective function, no constraints: Invoke the unconstrained nonlinear programming (NLPU) solver. For more details about the NLPU solver, refer to Chapter 13, The Unconstrained Nonlinear Programming Solver.
PROC NLP
The NLP procedure (NonLinear Programming) offers a set of optimization techniques for minimizing
or maximizing a continuous nonlinear function subject to linear and nonlinear, equality and inequality,
and lower and upper bound constraints. Problems of this type are found in many settings ranging
from optimal control to maximum likelihood estimation.
Nonlinear programs can be input into the procedure in various ways. The objective, constraint, and derivative functions are specified using the programming statements of PROC NLP. In addition, information in SAS data sets can be used to define the structure of objectives and constraints, and to specify constants used in objectives, constraints, and derivatives.
PROC NLP uses the following data sets to input various pieces of information:
- The DATA= data set enables you to specify data shared by all functions involved in a least squares problem.
- The INQUAD= data set contains the arrays appearing in a quadratic programming problem.
- The INEST= data set specifies initial values for the decision variables, the values of constants that are referred to in the program statements, and simple boundary and general linear constraints.
- The MODEL= data set specifies a model (functions, constraints, derivatives) saved at a previous execution of the NLP procedure.
As an alternative to supplying data in SAS data sets, some or all data for the model can be specified using SAS programming statements. These are similar to those used in the SAS DATA step.
For more details on PROC NLP, refer to Chapter 6, The NLP Procedure.
Model Building
Model generation and maintenance are often difficult and expensive aspects of applying mathematical programming techniques. The flexible input formats for the optimization procedures in SAS/OR software simplify this task.
PROC OPTLP
A small product-mix problem serves as a starting point for a discussion of different types of model
formats supported in SAS/OR software.
A candy manufacturer makes two products: chocolates and toffee. What combination of chocolates and toffee should be produced in a day in order to maximize the company's profit? Chocolates contribute $0.25 per pound to profit, and toffee contributes $0.75 per pound. The decision variables are chocolates and toffee.
Four processes are used to manufacture the candy:
1. Process 1 combines and cooks the basic ingredients for both chocolates and toffee.
2. Process 2 adds colors and flavors to the toffee, then cools and shapes the confection.
3. Process 3 chops and mixes nuts and raisins, adds them to the chocolates, and then cools and
cuts the bars.
4. Process 4 is packaging: chocolates are placed in individual paper shells; toffee is wrapped in
cellophane packages.
During the day, there are 7.5 hours (27,000 seconds) available for each process.
Firm time standards have been established for each process. For Process 1, mixing and cooking take
15 seconds for each pound of chocolate, and 40 seconds for each pound of toffee. Process 2 takes
56.25 seconds per pound of toffee. For Process 3, each pound of chocolate requires 18.75 seconds of
processing. In packaging, a pound of chocolates can be wrapped in 12 seconds, whereas a pound of
toffee requires 50 seconds. These data are summarized as follows:
                       Available        Required per Pound
                       Time          Chocolates      Toffee
   Process             (sec)           (sec)          (sec)
   1 Cooking           27,000           15             40
   2 Color/Flavor      27,000                          56.25
   3 Condiments        27,000           18.75
   4 Packaging         27,000           12             50
The objective is to

   Maximize: 0.25(chocolates) + 0.75(toffee)

which is the company's total profit.
The production of the candy is limited by the time available for each process. The limits placed on production by Process 1 are expressed by the following inequality:

   Process 1: 15(chocolates) + 40(toffee) ≤ 27,000

Process 1 can handle any combination of chocolates and toffee that satisfies this inequality.
The limits on production by other processes generate constraints described by the following inequalities:

   Process 2: 56.25(toffee) ≤ 27,000
   Process 3: 18.75(chocolates) ≤ 27,000
   Process 4: 12(chocolates) + 50(toffee) ≤ 27,000
This linear program illustrates the type of problem known as a product mix example. The mix of
products that maximizes the objective without violating the constraints is the solution. This model
can be represented in an MPS-format SAS data set.
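Before turning to the SAS representations, the model is small enough to check by hand in a few lines of code. The production plan below (1,000 pounds of chocolates and 300 pounds of toffee) satisfies all four process constraints and attains the objective value 475 that the Solution Summary reports later in this section; this is an illustrative sketch, not part of the SAS example.

```python
def profit(chocolates, toffee):
    """Objective: 0.25(chocolates) + 0.75(toffee)."""
    return 0.25 * chocolates + 0.75 * toffee

def feasible(chocolates, toffee):
    """Check the four process-time constraints (27,000 seconds each)."""
    return (15 * chocolates + 40 * toffee <= 27000 and   # Process 1
            56.25 * toffee <= 27000 and                  # Process 2
            18.75 * chocolates <= 27000 and              # Process 3
            12 * chocolates + 50 * toffee <= 27000)      # Process 4

print(feasible(1000, 300), profit(1000, 300))  # True 475.0
# pushing toffee to its Process 2 limit is feasible but less profitable:
print(feasible(250, 480), profit(250, 480))    # True 422.5
```

Comparisons like the second one show why an optimizer is needed: the most profitable plan is not obvious from any single constraint.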
Dense Format
The following DATA step creates a SAS data set for this product mix problem. Notice that the values of CHOCO and TOFFEE in the data set are the coefficients of those variables in the equations corresponding to the objective function and constraints. The variable _id_ contains a character string that names the rows in the data set. The variable _type_ is a character variable that contains keywords that describe the type of each row in the problem data set. The variable _rhs_ contains the right-hand-side values.
data factory;
input _id_ $ CHOCO TOFFEE _type_ $ _rhs_;
datalines;
object 0.25 0.75 MAX .
process1 15.00 40.00 LE 27000
process2 0.00 56.25 LE 27000
process3 18.75 0.00 LE 27000
process4 12.00 50.00 LE 27000
;
To solve this problem by using PROC LP, specify the following:
proc lp data = factory;
run;
Sparse Format
Typically, mathematical programming models are sparse; that is, few of the coefficients in the constraint matrix are nonzero. The OPTLP procedure accepts data in an MPS-format SAS data set, which is an efficient way to represent sparse models. Only the nonzero coefficients must be specified. It is consistent with the standard MPS sparse format, and much more flexible; models using the MPS format can be easily converted to the LP format. The appendix at the end of this book describes a SAS macro for conversion.
An example of an MPS-format SAS data set is illustrated here. The following data set contains the
data from the product mix problem of the preceding section.
data sp_factory;
format _type_ $8. _row_ $10. _col_ $10.;
input _type_ $ _row_ $ _col_ $ _coef_;
datalines;
max object . .
. object chocolate .25
. object toffee .75
le process1 . .
. process1 chocolate 15
. process1 toffee 40
. process1 _RHS_ 27000
le process2 . .
. process2 toffee 56.25
. process2 _RHS_ 27000
le process3 . .
. process3 chocolate 18.75
. process3 _RHS_ 27000
le process4 . .
. process4 chocolate 12
. process4 toffee 50
. process4 _RHS_ 27000
;
To solve this problem by using PROC LP, specify the following:
proc lp data=sp_factory sparsedata;
run;
The Solution Summary (shown in Figure 3.1) gives information about the solution that was found, including whether the optimizer terminated successfully after finding the optimum.
When PROC LP solves a problem, it uses an iterative process. First, the procedure finds a feasible solution that satisfies the constraints. Second, it finds the optimal solution from the set of feasible solutions. The Solution Summary lists information about the optimization process such as the number of iterations, the infeasibilities of the solution, and the time required to solve the problem.
Figure 3.1 Solution Summary
The LP Procedure
Solution Summary
Terminated Successfully
Objective Value 475
Phase 1 Iterations 0
Phase 2 Iterations 3
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 6
Time Used (seconds) 0
Number of Inversions 3
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Separating the Data from the Model Structure
It is often desirable to keep the data separate from the structure of the model. This is useful for large models with numerous identifiable components. The data are best organized in rectangular tables that can be easily examined and modified. Then, before the problem is solved, the model is built using the stored data. This process of model building is known as matrix generation. In conjunction with the sparse format, the SAS DATA step provides a good matrix generation language.
For example, consider the candy manufacturing example introduced previously. Suppose that, for the
user interface, it is more convenient to organize the data so that each record describes the information
related to each product (namely, the contribution to the objective function and the unit amount needed
for each process). A DATA step for saving the data might look like this:
data manfg;
format product $12.;
input product $ object process1 - process4 ;
datalines;
chocolate .25 15 0.00 18.75 12
toffee .75 40 56.25 0.00 50
licorice 1.00 29 30.00 20.00 20
jelly_beans .85 10 0.00 30.00 10
_RHS_ . 27000 27000 27000 27000
;
Notice that there is a special record at the end having product _RHS_. This record gives the amounts
of time available for each of the processes. This information could have been stored in another data
set. The next example illustrates a model where the data are stored in separate data sets.
Building the model involves adding the data to the structure. There are as many ways to do this as
there are programmers and problems. The following DATA step shows one way to use the candy
data to build a sparse format model to solve the product mix problem.
data model;
array process object process1-process4;
format _type_ $8. _row_ $12. _col_ $12. ;
keep _type_ _row_ _col_ _coef_;
   set manfg;      /* read the manufacturing data */

   /* build the objective function */
   if _n_=1 then do;
      _type_='max'; _row_='object'; _col_=' '; _coef_=.;
      output;
   end;

   /* build the constraints */
do over process;
if _i_>1 then do;
_type_='le'; _row_='process'||put(_i_-1,1.);
end;
else _row_='object';
_col_=product; _coef_=process;
output;
end;
run;
The sparse format data set is shown in Figure 3.2.
Figure 3.2 Sparse Data Format
Obs _type_ _row_ _col_ _coef_
1 max object .
2 max object chocolate 0.25
3 le process1 chocolate 15.00
4 le process2 chocolate 0.00
5 le process3 chocolate 18.75
6 le process4 chocolate 12.00
7 object toffee 0.75
8 le process1 toffee 40.00
9 le process2 toffee 56.25
10 le process3 toffee 0.00
11 le process4 toffee 50.00
12 object licorice 1.00
13 le process1 licorice 29.00
14 le process2 licorice 30.00
15 le process3 licorice 20.00
16 le process4 licorice 20.00
17 object jelly_beans 0.85
18 le process1 jelly_beans 10.00
19 le process2 jelly_beans 0.00
20 le process3 jelly_beans 30.00
21 le process4 jelly_beans 10.00
22 object _RHS_ .
23 le process1 _RHS_ 27000.00
24 le process2 _RHS_ 27000.00
25 le process3 _RHS_ 27000.00
26 le process4 _RHS_ 27000.00
The model data set looks a little different from the sparse representation of the candy model shown earlier. It not only includes additional products (licorice and jelly beans), but it also defines the model in a different order. Since the sparse format is robust, the model can be generated in ways that are convenient for the DATA step program.
If the problem had more products, you could increase the size of the manfg data set to include the
new product data. Also, if the problem had more than four processes, you could add the new process
variables to the manfg data set and increase the size of the process array in the model data set. With
these two simple changes and additional data, a product mix problem having hundreds of processes
and products can be solved.
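The same matrix-generation idea carries over to any language with basic data structures. The sketch below turns product records shaped like the manfg data set into (type, row, column, coefficient) triplets; unlike the DATA step above, it drops zero coefficients and tags every constraint row with its type, which are simplifications for illustration.

```python
# product records mirroring the manfg data set:
# (name, objective coefficient, seconds needed in processes 1-4)
records = [
    ("chocolate",   0.25, [15, 0.00, 18.75, 12]),
    ("toffee",      0.75, [40, 56.25, 0.00, 50]),
    ("licorice",    1.00, [29, 30.00, 20.00, 20]),
    ("jelly_beans", 0.85, [10, 0.00, 30.00, 10]),
]
rhs = [27000, 27000, 27000, 27000]      # time available per process

def generate_triplets(records, rhs):
    """Build a sparse (type, row, col, coef) list from dense records."""
    rows = [("max", "object", "", None)]        # objective header row
    for name, obj, coefs in records:
        rows.append(("", "object", name, obj))  # objective coefficient
        for i, coef in enumerate(coefs, start=1):
            if coef != 0:                       # sparse: keep nonzeros only
                rows.append(("le", f"process{i}", name, coef))
    for i, limit in enumerate(rhs, start=1):
        rows.append(("le", f"process{i}", "_RHS_", limit))
    return rows

triplets = generate_triplets(records, rhs)
```

Adding a product or a process means adding a record or a column; the generator code does not change, which is exactly the maintainability argument made above for the sparse format.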
PROC NETFLOW
Network flow problems can be described by specifying the nodes in the network and their supplies and demands, and the arcs in the network and their costs, capacities, and lower flow bounds. Consider the simple transshipment problem in Figure 3.3 as an illustration.
Figure 3.3 Transshipment Problem

(network diagram: factory_1 and factory_2, each with a supply of 500, connect through warehouse_1 and warehouse_2 to customer_1, customer_2, and customer_3, whose demands are 100, 200, and 50, respectively)
Suppose the candy manufacturing company has two factories, two warehouses, and three customers
for chocolate. The two factories each have a production capacity of 500 pounds per day. The three
customers have demands of 100, 200, and 50 pounds per day, respectively.
The following data set describes the supplies (positive values for the supdem variable) and the
demands (negative values for the supdem variable) for each of the customers and factories.
data nodes;
format node $10. ;
input node $ supdem;
datalines;
customer_1 -100
customer_2 -200
customer_3 -50
factory_1 500
factory_2 500
;
Suppose that there are two warehouses that are used to store the chocolate before shipment to the
customers, and that there are different costs for shipping between each factory, warehouse, and
customer. What is the minimum cost routing for supplying the customers?
Arcs are described in another data set. Each observation defines a new arc in the network and gives data about the arc. For example, there is an arc between the node factory_1 and the node warehouse_1. Each unit of flow on that arc costs 10. Although this example does not include it, lower and upper bounds on the flow across that arc can be listed here.
data network;
format from $12. to $12.;
input from $ to $ cost ;
datalines;
factory_1 warehouse_1 10
factory_2 warehouse_1 5
factory_1 warehouse_2 7
factory_2 warehouse_2 9
warehouse_1 customer_1 3
warehouse_1 customer_2 4
warehouse_1 customer_3 4
warehouse_2 customer_1 5
warehouse_2 customer_2 5
warehouse_2 customer_3 6
;
You can use PROC NETFLOW to find the minimum cost routing. This procedure takes the model as defined in the network and nodes data sets and finds the minimum cost flow.
proc netflow arcout=arc_sav
   arcdata=network nodedata=nodes;
   node node;        /* node data set information */
   supdem supdem;
   tail from;        /* arc data set information */
   head to;
   cost cost;
run;
proc print;
var from to cost _capac_ _lo_ _supply_ _demand_
_flow_ _fcost_ _rcost_;
sum _fcost_;
run;
PROC NETFLOW produces the following messages in the SAS log:
NOTE: Number of nodes= 7 .
NOTE: Number of supply nodes= 2 .
NOTE: Number of demand nodes= 3 .
NOTE: Total supply= 1000 , total demand= 350 .
NOTE: Number of arcs= 10 .
NOTE: Number of iterations performed (neglecting any constraints)= 9 .
NOTE: Of these, 2 were degenerate.
NOTE: Optimum (neglecting any constraints) found.
NOTE: Minimal total cost= 3050 .
NOTE: The data set WORK.ARC_SAV has 10 observations and 13 variables.
The solution (Figure 3.4) saved in the arc_sav data set shows the optimal amount of chocolate to send
across each arc (the amount to ship from each factory to each warehouse and from each warehouse
to each customer) in the network per day.
Figure 3.4 ARCOUT Data Set
Obs  from         to           cost   _CAPAC_   _LO_  _SUPPLY_  _DEMAND_  _FLOW_  _FCOST_  _RCOST_
1 warehouse_1 customer_1 3 99999999 0 . 100 100 300 .
2 warehouse_2 customer_1 5 99999999 0 . 100 0 0 4
3 warehouse_1 customer_2 4 99999999 0 . 200 200 800 .
4 warehouse_2 customer_2 5 99999999 0 . 200 0 0 3
5 warehouse_1 customer_3 4 99999999 0 . 50 50 200 .
6 warehouse_2 customer_3 6 99999999 0 . 50 0 0 4
7 factory_1 warehouse_1 10 99999999 0 500 . 0 0 5
8 factory_2 warehouse_1 5 99999999 0 500 . 350 1750 .
9 factory_1 warehouse_2 7 99999999 0 500 . 0 0 .
10 factory_2 warehouse_2 9 99999999 0 500 . 0 0 2
====
3050
Notice which arcs have positive flow (_FLOW_ is greater than 0). These arcs indicate the amount of chocolate that should be sent from factory_2 to warehouse_1 and from there to the three customers. The model indicates no production at factory_1 and no use of warehouse_2.
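The optimal flows are easy to double-check outside of SAS: the total cost should match the "Minimal total cost= 3050" note in the log, and the net flow at each node should match its supply or demand. A quick sketch, using the arc costs from the network data set and the positive flows reported in the arc_sav data set:

```python
# arc costs from the network data set
cost = {
    ("factory_1", "warehouse_1"): 10, ("factory_2", "warehouse_1"): 5,
    ("factory_1", "warehouse_2"): 7,  ("factory_2", "warehouse_2"): 9,
    ("warehouse_1", "customer_1"): 3, ("warehouse_1", "customer_2"): 4,
    ("warehouse_1", "customer_3"): 4, ("warehouse_2", "customer_1"): 5,
    ("warehouse_2", "customer_2"): 5, ("warehouse_2", "customer_3"): 6,
}
# positive flows reported in the arc_sav data set
flow = {
    ("factory_2", "warehouse_1"): 350,
    ("warehouse_1", "customer_1"): 100,
    ("warehouse_1", "customer_2"): 200,
    ("warehouse_1", "customer_3"): 50,
}

total_cost = sum(cost[arc] * f for arc, f in flow.items())

def net_outflow(node):
    """Flow leaving the node minus flow entering it."""
    out = sum(f for (u, v), f in flow.items() if u == node)
    into = sum(f for (u, v), f in flow.items() if v == node)
    return out - into

print(total_cost)                  # 3050
print(net_outflow("customer_2"))   # -200: demand is met
print(net_outflow("warehouse_1"))  # 0: pure transshipment node
```

Balance checks like these are a useful sanity test for any network solution, whatever solver produced it.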
Figure 3.5 Optimal Solution for the Transshipment Problem

(network diagram: factory_2 sends 350 pounds to warehouse_1, which sends 100, 200, and 50 pounds to customer_1, customer_2, and customer_3, respectively; factory_1 and warehouse_2 carry no flow)
PROC OPTMODEL
Modeling a Linear Programming Problem
Consider the candy manufacturer's problem described in the section "PROC OPTLP" on page 21.
You can formulate the problem using PROC OPTMODEL and solve it using the primal simplex
solver as follows:
proc optmodel;
   /* declare variables */
   var choco, toffee;

   /* maximize objective function (profit) */
   maximize profit = 0.25*choco + 0.75*toffee;

   /* subject to constraints */
   con process1: 15*choco + 40*toffee <= 27000;
   con process2: 56.25*toffee <= 27000;
   con process3: 18.75*choco <= 27000;
   con process4: 12*choco + 50*toffee <= 27000;

   /* solve LP using primal simplex solver */
   solve with lp / solver = primal_spx;

   /* display solution */
   print choco toffee;
quit;
The optimal objective value and the optimal solution are displayed in the following summary output:
The OPTMODEL Procedure
Solution Summary
Solver Primal Simplex
Objective Function profit
Solution Status Optimal
Objective Value 475
Iterations 3
Primal Infeasibility 0
Dual Infeasibility 0
Bound Infeasibility 0
You can observe from the preceding example that PROC OPTMODEL provides an easy and intuitive
way of modeling and solving mathematical programming models.
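The model is small enough to cross-check outside SAS. The following Python sketch (not part of the SAS example; it assumes NumPy is available) enumerates the vertices of the feasible region: every candidate vertex lies at the intersection of two constraints, including the nonnegativity bounds. The best feasible vertex reproduces the objective value 475 reported above.

```python
# Brute-force vertex enumeration for the candy LP: maximize 0.25c + 0.75t
# subject to the four process constraints and c, t >= 0.
import itertools
import numpy as np

# constraints A @ x <= b; the last two rows encode c >= 0 and t >= 0
A = np.array([[15.0, 40.0], [0.0, 56.25], [18.75, 0.0], [12.0, 50.0],
              [-1.0, 0.0], [0.0, -1.0]])
b = np.array([27000.0, 27000.0, 27000.0, 27000.0, 0.0, 0.0])
profit = np.array([0.25, 0.75])

best, best_x = -np.inf, None
for i, j in itertools.combinations(range(len(b)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-9:
        continue                        # parallel constraints: no vertex
    x = np.linalg.solve(M, b[[i, j]])
    if np.all(A @ x <= b + 1e-7):       # keep only feasible vertices
        value = profit @ x
        if value > best:
            best, best_x = value, x

print(best, best_x)   # optimal profit 475 at choco = 1000, toffee = 300
```

The optimum sits where the Process 1 and Process 4 constraints intersect, which matches the three simplex iterations reported in the solution summary.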
Modeling a Nonlinear Programming Problem
The following optimization problem illustrates how you can use some features of PROC OPTMODEL
to formulate and solve nonlinear programming problems. The objective of the problem is to find
coefficients for an approximation function that matches the values of a given function, f(x), at
a set of points P. The approximation is a rational function with degree d in the numerator and
denominator:

   r(x) = (a_0 + Σ_{i=1..d} a_i x^i) / (b_0 + Σ_{i=1..d} b_i x^i)

The problem can be formulated by minimizing the sum of squared errors at each point in P:

   min Σ_{x ∈ P} [r(x) − f(x)]²

The following code implements this model. The function f(x) = 2^x is approximated over a set
of points P in the range 0 to 1. The function values are saved in a data set that is used by PROC
OPTMODEL to set model parameters:
data points;
   /* generate data points */
   keep f x;
   do i = 0 to 100;
      x = i/100;
      f = 2**x;
      output;
   end;
proc optmodel;
   /* declare, read, and save our data points */
   set points;
   number f{points};
   read data points into points = [x] f;

   /* declare variables and model parameters */
   number d = 1;          /* linear polynomial */
   var a{0..d};
   var b{0..d} init 1;
   constraint fixb0: b[0] = 1;

   /* minimize sum of squared errors */
   min z = sum{x in points}
      ((a[0] + sum{i in 1..d} a[i]*x**i) /
       (b[0] + sum{i in 1..d} b[i]*x**i) - f[x])**2;

   /* solve and show coefficients */
   solve;
   print a b;
quit;
The expression for the objective z is defined using operators that parallel the mathematical form. In
this case the polynomials in the rational function are linear, so d is equal to 1.

The constraint fixb0 forces the constant term of the rational function denominator, b[0], to equal 1.
This causes the resulting coefficients to be normalized. The OPTMODEL presolver preprocesses the
problem to remove the constraint. An unconstrained solver is used after substituting for b[0].

The SOLVE statement selects a solver, calls it, and displays the status. The PRINT command then
prints the values of the coefficient arrays a and b:
The OPTMODEL Procedure
Solution Summary
Solver NLPU/LBFGS
Objective Function z
Solution Status Optimal
Objective Value 0.0000590999
Iterations 11
Optimality Error 6.5769853E-7
The approximation for f(x) = 2^x between 0 and 1 is therefore

   r_approx(x) = (0.99817 + 0.42064x) / (1 − 0.29129x)
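As an independent check, evaluating the fit with the printed (rounded) coefficients over the same 101 points should give a sum of squared errors close to the reported objective value, 0.0000590999. A small Python sketch:

```python
# Evaluate the reported rational approximation of 2**x over x = 0, 0.01, ..., 1.
def r_approx(x):
    return (0.99817 + 0.42064 * x) / (1.0 - 0.29129 * x)

pts = [i / 100.0 for i in range(101)]
sse = sum((r_approx(x) - 2.0 ** x) ** 2 for x in pts)       # sum of squared errors
max_err = max(abs(r_approx(x) - 2.0 ** x) for x in pts)     # worst pointwise error
print(sse, max_err)
```

Because the printed coefficients are rounded to five decimals, the computed sum of squared errors agrees with the solver's objective value only approximately, but the worst pointwise error stays on the order of 0.002 at the interval endpoints.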
Matrix Generation
It is desirable to keep data in separate tables, and then to automate model building and reporting.
This example illustrates a problem that has elements of both a product mix problem and a blending
problem. Suppose four kinds of ties are made: all silk, all polyester, a 50-50 polyester-cotton blend,
and a 70-30 cotton-polyester blend.
The data include cost and supplies of raw material, selling price, minimum contract sales, maximum
demand of the finished products, and the proportions of raw materials that go into each product. The
objective is to find the product mix that maximizes profit.
The data are saved in three SAS data sets. The program that follows demonstrates one way for these
data to be saved.
data material;
format descpt $20.;
input descpt $ cost supply;
datalines;
silk_material .21 25.8
polyester_material .6 22.0
cotton_material .9 13.6
;
data tie;
format descpt $20.;
input descpt $ price contract demand;
datalines;
all_silk 6.70 6.0 7.00
all_polyester 3.55 10.0 14.00
poly_cotton_blend 4.31 13.0 16.00
cotton_poly_blend 4.81 6.0 8.50
;
data manfg;
format descpt $20.;
input descpt $ silk poly cotton;
datalines;
all_silk 100 0 0
all_polyester 0 100 0
poly_cotton_blend 0 50 50
cotton_poly_blend 0 30 70
;
The following program takes the raw data from the three data sets and builds a linear program model
in the data set called model. Although it is designed for the three-resource, four-product problem
described here, it can easily be extended to include more resources and products. The model-building
DATA step remains essentially the same; all that changes are the dimensions of loops and arrays. Of
course, the data tables must expand to accommodate the new data.
data model;
   array raw_mat {3} $ 20 ;
   array raw_comp {3} silk poly cotton;
   length _type_ $ 8 _col_ $ 20 _row_ $ 20 _coef_ 8 ;
   keep _type_ _col_ _row_ _coef_ ;

   /* define the objective, lower, and upper bound rows */
   _row_='profit'; _type_='max';     output;
   _row_='lower';  _type_='lowerbd'; output;
   _row_='upper';  _type_='upperbd'; output;
   _type_=' ';

   /* the objective and upper rows for the raw materials */
   do i=1 to 3;
      set material;
      raw_mat[i]=descpt; _col_=descpt;
      _row_='profit'; _coef_=-cost;  output;
      _row_='upper';  _coef_=supply; output;
   end;

   /* the objective, upper, and lower rows for the products */
   do i=1 to 4;
      set tie;
      _col_=descpt;
      _row_='profit'; _coef_=price;    output;
      _row_='lower';  _coef_=contract; output;
      _row_='upper';  _coef_=demand;   output;
   end;

   /* the coefficient matrix for manufacturing */
   _type_='eq';
   do i=1 to 4;    /* loop for each product */
      set manfg;
      do j=1 to 3; /* loop for each raw material */
         _col_=descpt; /* % of material in product */
         _row_=raw_mat[j];
         _coef_=raw_comp[j]/100;
         output;
         _col_=raw_mat[j]; _coef_=-1;
         output;

         /* the right-hand side */
         if i=1 then do;
            _col_='_RHS_';
            _coef_=0;
            output;
         end;
      end;
      _type_=' ';
   end;
   stop;
run;
The model is solved using PROC LP, which saves the solution in the PRIMALOUT data set named
solution. PROC PRINT displays the solution, shown in Figure 3.6.
proc lp sparsedata primalout=solution;
proc print ;
id _var_;
var _lbound_--_r_cost_;
run;
Figure 3.6 Solution Data Set
_VAR_ _LBOUND_ _VALUE_ _UBOUND_ _PRICE_ _R_COST_
all_polyester 10 11.800 14.0 3.55 0.000
all_silk 6 7.000 7.0 6.70 6.490
cotton_material 0 13.600 13.6 -0.90 4.170
cotton_poly_blend 6 8.500 8.5 4.81 0.196
polyester_material 0 22.000 22.0 -0.60 2.950
poly_cotton_blend 13 15.300 16.0 4.31 0.000
silk_material 0 7.000 25.8 -0.21 0.000
PHASE_1_OBJECTIVE 0 0.000 0.0 0.00 0.000
profit 0 168.708 1.7977E308 0.00 0.000
The solution shows that 11.8 units of polyester ties, 7 units of silk ties, 8.5 units of the
cotton-polyester blend, and 15.3 units of the polyester-cotton blend should be produced. It also shows
the amounts of raw materials that go into this product mix to generate a total profit of 168.708.
Exploiting Model Structure
Another example helps to illustrate how the model can be simplified by exploiting the structure in
the model when using the NETFLOW procedure.
Recall the chocolate transshipment problem discussed previously. The solution required no
production at factory_1 and no storage at warehouse_2. Suppose this solution, although optimal, is
unacceptable. An additional constraint requiring the production at the two factories to be balanced is
needed. Now, the production at the two factories can differ by, at most, 100 units. Such a constraint
might look like this:
-100 <= (factory_1_warehouse_1 + factory_1_warehouse_2 -
factory_2_warehouse_1 - factory_2_warehouse_2) <= 100
The network and supply and demand information are saved in the following two data sets:
data network;
format from $12. to $12.;
input from $ to $ cost ;
datalines;
factory_1 warehouse_1 10
factory_2 warehouse_1 5
factory_1 warehouse_2 7
factory_2 warehouse_2 9
warehouse_1 customer_1 3
warehouse_1 customer_2 4
warehouse_1 customer_3 4
warehouse_2 customer_1 5
warehouse_2 customer_2 5
warehouse_2 customer_3 6
;
data nodes;
format node $12. ;
input node $ supdem;
datalines;
customer_1 -100
customer_2 -200
customer_3 -50
factory_1 500
factory_2 500
;
The factory-balancing constraint is not a part of the network. It is represented in the sparse format in
a data set for side constraints.
data side_con;
format _type_ $8. _row_ $8. _col_ $21. ;
input _type_ _row_ _col_ _coef_ ;
datalines;
eq balance . .
. balance factory_1_warehouse_1 1
. balance factory_1_warehouse_2 1
. balance factory_2_warehouse_1 -1
. balance factory_2_warehouse_2 -1
. balance diff -1
lo lowerbd diff -100
up upperbd diff 100
;
This data set contains an equality constraint that sets the value of DIFF to be the amount that factory
1 production exceeds factory 2 production. It also contains implicit bounds on the DIFF variable.
Note that the DIFF variable is a nonarc variable.
You can use the following call to PROC NETFLOW to solve the problem:
proc netflow
conout=con_sav
arcdata=network nodedata=nodes condata=side_con
sparsecondata ;
node node;
supdem supdem;
tail from;
head to;
cost cost;
run;
proc print;
var from to _name_ cost _capac_ _lo_ _supply_ _demand_
_flow_ _fcost_ _rcost_;
sum _fcost_;
run;
The solution is saved in the con_sav data set, as displayed in Figure 3.7.
Figure 3.7 CON_SAV Data Set
Obs  from         to           _NAME_  cost   _CAPAC_  _LO_  _SUPPLY_  _DEMAND_  _FLOW_  _FCOST_  _RCOST_
1 warehouse_1 customer_1 3 99999999 0 . 100 100 300 .
2 warehouse_2 customer_1 5 99999999 0 . 100 0 0 1.0
3 warehouse_1 customer_2 4 99999999 0 . 200 75 300 .
4 warehouse_2 customer_2 5 99999999 0 . 200 125 625 .
5 warehouse_1 customer_3 4 99999999 0 . 50 50 200 .
6 warehouse_2 customer_3 6 99999999 0 . 50 0 0 1.0
7 factory_1 warehouse_1 10 99999999 0 500 . 0 0 2.0
8 factory_2 warehouse_1 5 99999999 0 500 . 225 1125 .
9 factory_1 warehouse_2 7 99999999 0 500 . 125 875 .
10 factory_2 warehouse_2 9 99999999 0 500 . 0 0 5.0
11 diff 0 100 -100 . . -100 0 1.5
====
3425
Notice that the solution now has production balanced across the factories; the production at factory 2
exceeds that at factory 1 by 100 units.
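The new solution can be verified the same way as before. The following Python sketch (flows copied from Figure 3.7) recomputes the total cost, the value of the nonarc variable DIFF, and checks that the side constraint holds.

```python
# Verify the CON_SAV solution: (tail, head) -> (cost, flow), from Figure 3.7.
flows = {
    ("warehouse_1", "customer_1"): (3, 100),
    ("warehouse_2", "customer_1"): (5, 0),
    ("warehouse_1", "customer_2"): (4, 75),
    ("warehouse_2", "customer_2"): (5, 125),
    ("warehouse_1", "customer_3"): (4, 50),
    ("warehouse_2", "customer_3"): (6, 0),
    ("factory_1", "warehouse_1"): (10, 0),
    ("factory_2", "warehouse_1"): (5, 225),
    ("factory_1", "warehouse_2"): (7, 125),
    ("factory_2", "warehouse_2"): (9, 0),
}
total_cost = sum(c * f for c, f in flows.values())

f1 = sum(f for (t, _), (_, f) in flows.items() if t == "factory_1")
f2 = sum(f for (t, _), (_, f) in flows.items() if t == "factory_2")
diff = f1 - f2               # the nonarc variable in the side constraint

print(total_cost, diff)      # 3425 -100
print(-100 <= diff <= 100)   # True: the balance constraint is satisfied
```

Balancing production costs an extra 3425 − 3050 = 375 units relative to the unconstrained optimum.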
Figure 3.8 Constrained Optimum for the Transshipment Problem

[Network diagram: each factory has a supply of 500 units. In the constrained optimum, factory_1
ships 125 units to warehouse_2 and factory_2 ships 225 units to warehouse_1; warehouse_1 sends
100, 75, and 50 units to customers 1 through 3, and warehouse_2 sends 125 units to customer_2.]
Report Writing
The reporting of the solution is also an important aspect of modeling. Since the optimization
procedures save the solution in one or more SAS data sets, reports can be written using any of the
tools in the SAS language.
The DATA Step
Use of the DATA step and PROC PRINT is the most common way to produce reports. For example,
from the data set solution shown in Figure 3.6, a table showing the revenue of the optimal production
plan and a table of the cost of material can be produced with the following program.
data product(keep=_var_ _value_ _price_ revenue)
     material(keep=_var_ _value_ _price_ cost);
   set solution;
   if _price_ > 0 then do;
      revenue = _price_ * _value_;
      output product;
   end;
   else if _price_ < 0 then do;
      _price_ = -_price_;
      cost = _price_ * _value_;
      output material;
   end;
run;
/* display the product report */
proc print data=product;
   id _var_;
   var _value_ _price_ revenue;
   sum revenue;
   title 'Revenue Generated from Tie Sales';
run;

/* display the materials report */
proc print data=material;
   id _var_;
   var _value_ _price_ cost;
   sum cost;
   title 'Cost of Raw Materials';
run;
This DATA step reads the solution data set saved by PROC LP and segregates the records based
on whether they correspond to materials or products, namely whether the contribution to profit is
positive or negative. Each of these is then displayed to produce Figure 3.9.
Figure 3.9 Tie Problem: Revenues and Costs
Revenue Generated from Tie Sales
_VAR_ _VALUE_ _PRICE_ revenue
all_polyester 11.8 3.55 41.890
all_silk 7.0 6.70 46.900
cotton_poly_blend 8.5 4.81 40.885
poly_cotton_blend 15.3 4.31 65.943
=======
195.618
Cost of Raw Materials
_VAR_ _VALUE_ _PRICE_ cost
cotton_material 13.6 0.90 12.24
polyester_material 22.0 0.60 13.20
silk_material 7.0 0.21 1.47
=====
26.91
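The same segregation logic can be sketched in any language. The following Python fragment reproduces the report totals from the solution values in Figure 3.6; note that revenue minus cost equals the optimal profit, 168.708.

```python
# Reproduce the report totals: (_VAR_, _VALUE_, _PRICE_) from Figure 3.6.
solution = [
    ("all_polyester",      11.8,  3.55),
    ("all_silk",            7.0,  6.70),
    ("cotton_material",    13.6, -0.90),
    ("cotton_poly_blend",   8.5,  4.81),
    ("polyester_material", 22.0, -0.60),
    ("poly_cotton_blend",  15.3,  4.31),
    ("silk_material",       7.0, -0.21),
]
# positive prices are products (revenue); negative prices are raw materials (cost)
revenue = sum(v * p for _, v, p in solution if p > 0)
cost = sum(v * -p for _, v, p in solution if p < 0)

print(round(revenue, 3), round(cost, 2))  # 195.618 26.91
print(round(revenue - cost, 3))           # 168.708 (the optimal profit)
```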
Other Reporting Procedures
The GCHART procedure can be a useful tool for displaying the solution to mathematical
programming models. The con_sav data set that contains the solution to the balanced transshipment
problem can be effectively displayed using PROC GCHART. In Figure 3.10, the amount that is shipped
from each factory and warehouse can be seen by submitting the following SAS code:
title;
proc gchart data=con_sav;
hbar from / sumvar=_flow_;
run;
Figure 3.10 Tie Problem: Throughputs
The horizontal bar chart is just one way of displaying the solution to a mathematical program. The
solution to the Tie Product Mix problem that was solved using PROC LP can also be illustrated using
PROC GCHART. Here, a pie chart shows the relative contribution of each product to total revenues.
proc gchart data=product;
pie _var_ / sumvar=revenue;
title 'Projected Tie Sales Revenue';
run;
Figure 3.11 Tie Problem: Projected Tie Sales Revenue
The TABULATE procedure is another procedure that can help automate solution reporting. Several
examples in Chapter 5, "The LP Procedure," illustrate its use.
Chapter 4
The INTPOINT Procedure
Contents
Overview: INTPOINT Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Mathematical Description of NPSC . . . . . . . . . . . . . . . . . . . . . . 45
Mathematical Description of LP . . . . . . . . . . . . . . . . . . . . . . . . . 47
The Interior Point Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Network Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Getting Started: NPSC Problems . . . . . . . . . . . . . . . . . . . . . . . 63
Getting Started: LP Problems . . . . . . . . . . . . . . . . . . . . . . . . . 70
Typical PROC INTPOINT Run . . . . . . . . . . . . . . . . . . . . . . . . 78
Syntax: INTPOINT Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Functional Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
PROC INTPOINT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . 81
CAPACITY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
COEF Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
COLUMN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
COST Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
DEMAND Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
HEADNODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
ID Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
LO Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
NAME Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
NODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
QUIT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
RHS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
ROW Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
RUN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
SUPDEM Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
SUPPLY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
TAILNODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
TYPE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
VAR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Details: INTPOINT Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Input Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Output Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Converting Any PROC INTPOINT Format to an MPS-Format SAS Data Set 120
Case Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Loop Arcs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Multiple Arcs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Flow and Value Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Tightening Bounds and Side Constraints . . . . . . . . . . . . . . . . . . . 122
Reasons for Infeasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Missing S Supply and Missing D Demand Values . . . . . . . . . . . . . . . 123
Balancing Total Supply and Total Demand . . . . . . . . . . . . . . . . . . . 127
How to Make the Data Read of PROC INTPOINT More Efficient . . . . . . 129
Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Examples: INTPOINT Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Example 4.1: Production, Inventory, Distribution Problem . . . . . . . . . . 139
Example 4.2: Altering Arc Data . . . . . . . . . . . . . . . . . . . . . . . . 144
Example 4.3: Adding Side Constraints . . . . . . . . . . . . . . . . . . . . 149
Example 4.4: Using Constraints and More Alteration to Arc Data . . . . . . 154
Example 4.5: Nonarc Variables in the Side Constraints . . . . . . . . . . . . 159
Example 4.6: Solving an LP Problem with Data in MPS Format . . . . . . . 166
Example 4.7: Converting to an MPS-Format SAS Data Set . . . . . . . . . . 167
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Overview: INTPOINT Procedure
The INTPOINT procedure solves the Network Program with Side Constraints (NPSC) problem
(defined in the section "Mathematical Description of NPSC" on page 45) and the more general Linear
Programming (LP) problem (defined in the section "Mathematical Description of LP" on page 47).
NPSC and LP models can be used to describe a wide variety of real-world applications ranging from
production, inventory, and distribution problems to financial applications.
Whether your problem is NPSC or LP, PROC INTPOINT uses the same optimization algorithm, the
interior point algorithm. This algorithm is outlined in the section "The Interior Point Algorithm" on
page 47.

While many of your problems may best be formulated as LP problems, there may be other instances
when your problems are better formulated as NPSC problems. The section "Network Models" on
page 55 describes typical models that have a network component and suggests reasons why NPSC
may be preferable to LP. The section "Getting Started: NPSC Problems" on page 63 outlines how
you supply data of any NPSC problem to PROC INTPOINT and call the procedure. After it reads
the NPSC data, PROC INTPOINT converts the problem into an equivalent LP problem, performs
interior point optimization, then converts the solution it finds back into a form you can use as the
optimum to the original NPSC model.
If your model is an LP problem, the way you supply the data to PROC INTPOINT and run the
procedure is described in the section "Getting Started: LP Problems" on page 70.
You can also solve LP problems by using the OPTLP procedure. The OPTLP procedure requires
a linear program to be specied by using a SAS data set that adheres to the MPS format, a widely
accepted format in the optimization community. You can use the MPSOUT= option in the INTPOINT
procedure to convert typical PROC INTPOINT format data sets into MPS-format SAS data sets.
The remainder of this chapter is organized as follows:

- The section "Typical PROC INTPOINT Run" on page 78 describes how to use this procedure.
- The section "Syntax: INTPOINT Procedure" on page 79 describes all the statements and options of PROC INTPOINT.
- The section "Functional Summary" on page 79 lists the statements and options that can be used to control PROC INTPOINT.
- The section "Details: INTPOINT Procedure" on page 109 contains detailed explanations, descriptions, and advice on the use and behavior of the procedure.
- PROC INTPOINT is demonstrated by solving several examples in the section "Examples: INTPOINT Procedure" on page 137.
Mathematical Description of NPSC
A network consists of a collection of nodes joined by a collection of arcs. The arcs connect nodes
and convey flow of one or more commodities that are supplied at supply nodes and demanded at
demand nodes in the network. Each arc has a cost per unit of flow, a flow capacity, and a lower
flow bound associated with it. An important concept in network modeling is conservation of flow.
Conservation of flow means that the total flow in arcs directed toward a node, plus the supply at the
node, minus the demand at the node, equals the total flow in arcs directed away from the node.

Often all the details of a problem cannot be specified in a network model alone. In many of these cases,
these details can be represented by the addition of side constraints to the model. Side constraints
are linear functions of arc variables (variables containing flow through an arc) and nonarc variables
(variables that are not part of the network). The data for a side constraint consist of coefficients of
arcs and coefficients of nonarc variables, a constraint type (that is, ≤, =, or ≥), and a right-hand-side
value (rhs). A nonarc variable has a name, an objective function coefficient analogous to an arc cost,
an upper bound analogous to an arc capacity, and a lower bound analogous to an arc lower flow
bound.
If a network component of NPSC is removed by merging arcs and nonarc variables into a single set
of variables, and if the flow conservation constraints and side constraints are merged into a single set
of constraints, the result is an LP problem. PROC INTPOINT will automatically transform an NPSC
problem into an equivalent LP problem, perform the optimization, then transform the problem back
into its original form. By doing this, PROC INTPOINT finds the flow through the network and the
values of any nonarc variables that minimize the total cost of the solution. Flow conservation is met,
flow through each arc is on or between the arc's lower flow bound and capacity, the value of each
nonarc variable is on or between the nonarc's lower and upper bounds, and the side constraints are
satisfied.
Note that, since many LPs have large embedded networks, PROC INTPOINT is an attractive
alternative to the LP procedure in many cases. Rather than formulating all problems as LPs, network
models remain conceptually easy since they are based on network diagrams that represent the problem
pictorially. PROC INTPOINT accepts the network specification in a format that is particularly suited
to networks. This not only simplifies problem description but also aids in the interpretation of the
solution. The conversion to and from the equivalent LP is done behind the scenes by the procedure.

If a network programming problem with side constraints has n nodes, a arcs, g nonarc variables, and
k side constraints, then the formal statement of the problem solved by PROC INTPOINT is
   minimize    cᵀx + dᵀz
   subject to  Fx = b
               Hx + Qz {≥, =, ≤} r
               l ≤ x ≤ u
               m ≤ z ≤ v
where

- c is the a × 1 arc variable objective function coefficient vector (the cost vector)
- x is the a × 1 arc variable value vector (the flow vector)
- d is the g × 1 nonarc variable objective function coefficient vector
- z is the g × 1 nonarc variable value vector
- F is the n × a node-arc incidence matrix of the network, where F(i,j) = 1 if arc j is directed from node i, F(i,j) = −1 if arc j is directed toward node i, and F(i,j) = 0 otherwise
- b is the n × 1 node supply/demand vector, where b(i) = s if node i has supply capability of s units of flow, b(i) = −d if node i has demand of d units of flow, and b(i) = 0 if node i is a transshipment node
- H is the k × a side constraint coefficient matrix for arc variables, where H(i,j) is the coefficient of arc j in the ith side constraint
- Q is the k × g side constraint coefficient matrix for nonarc variables, where Q(i,j) is the coefficient of nonarc variable j in the ith side constraint
- r is the k × 1 side constraint right-hand-side vector
- l is the a × 1 arc lower flow bound vector
- u is the a × 1 arc capacity vector
- m is the g × 1 nonarc variable lower bound vector
- v is the g × 1 nonarc variable upper bound vector
The INTPOINT procedure can also be used to solve an unconstrained network problem, that is, one
in which H, Q, d, r, and z do not exist. It can also be used to solve a network problem with side
constraints but no nonarc variables, in which case Q, d, and z do not exist.
Mathematical Description of LP
A linear programming (LP) problem has a linear objective function and a collection of linear
constraints. PROC INTPOINT finds the values of the variables that minimize the total cost of the
solution. The value of each variable is on or between the variable's lower and upper bounds, and the
constraints are satisfied.

If an LP has g variables and k constraints, then the formal statement of the problem solved by PROC
INTPOINT is

   minimize    dᵀz
   subject to  Qz {≥, =, ≤} r
               m ≤ z ≤ v

where

- d is the g × 1 variable objective function coefficient vector
- z is the g × 1 variable value vector
- Q is the k × g constraint coefficient matrix for the variables, where Q(i,j) is the coefficient of variable j in the ith constraint
- r is the k × 1 constraint right-hand-side vector
- m is the g × 1 variable lower bound vector
- v is the g × 1 variable upper bound vector
The Interior Point Algorithm
The simplex algorithm, developed shortly after World War II, was for many years the main method
used to solve linear programming problems. Over the last fifteen years, however, the interior point
algorithm has been developed. This algorithm also solves linear programming problems. From the
start it showed great theoretical promise, and considerable research in the area resulted in practical
implementations that performed competitively with the simplex algorithm. More recently, interior
point algorithms have evolved to become superior to the simplex algorithm, in general, especially
when the problems are large.

There are many variations of interior point algorithms. PROC INTPOINT uses the Primal-Dual with
Predictor-Corrector algorithm. More information on this particular algorithm and related theory can
be found in the texts by Roos, Terlaky, and Vial (1997), Wright (1996), and Ye (1996).
Interior Point Algorithmic Details
After preprocessing, the linear program to be solved is

   minimize    cᵀx
   subject to  Ax = b
               x ≥ 0

This is the primal problem. The matrices d, z, and Q of NPSC have been renamed c, x, and A,
respectively, as these symbols are by convention used more, the problem to be solved is different from
the original because of preprocessing, and there has been a change of primal variable to transform
the LP into one whose variables have zero lower bounds. To simplify the algebra here, assume that
variables have infinite upper bounds, and constraints are equalities. (Interior point algorithms do
efficiently handle finite upper bounds, and it is easy to introduce primal slack variables to change
inequalities into equalities.) The problem has n variables; i is a variable number; k is an iteration
number, and if used as a subscript or superscript it denotes "of iteration k."
There exists an equivalent problem, the dual problem, stated as

   maximize    bᵀy
   subject to  Aᵀy + s = c
               s ≥ 0

where y are dual variables, and s are dual constraint slacks.

The interior point algorithm solves the system of equations to satisfy the Karush-Kuhn-Tucker (KKT)
conditions for optimality:

   Ax = b
   Aᵀy + s = c
   XSe = 0
   x ≥ 0
   s ≥ 0

where

   S = diag(s)  (that is, S(i,j) = s(i) if i = j, and S(i,j) = 0 otherwise)
   X = diag(x)
   e(i) = 1 for all i
These are the conditions for feasibility, with the complementarity condition XSe = 0 added.
Complementarity forces the optimal objectives of the primal and dual to be equal, cᵀx_opt = bᵀy_opt,
as

   0 = x_optᵀ s_opt = s_optᵀ x_opt = (c − Aᵀy_opt)ᵀ x_opt
     = cᵀx_opt − y_optᵀ(A x_opt) = cᵀx_opt − bᵀy_opt
Before the optimum is reached, a solution (x, y, s) may not satisfy the KKT conditions:

- Primal constraints may be violated: infeas_c = b − Ax ≠ 0.
- Dual constraints may be violated: infeas_d = c − Aᵀy − s ≠ 0.
- Complementarity may not be satisfied: xᵀs = cᵀx − bᵀy ≠ 0. This is called the duality gap.
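These residuals are easy to compute for a concrete problem. The following Python sketch (a toy LP, not PROC INTPOINT; it assumes NumPy) evaluates the three KKT residuals at a known optimum, where all of them vanish.

```python
# KKT residuals for the LP: min c'x s.t. Ax = b, x >= 0.
import numpy as np

def kkt_residuals(A, b, c, x, y, s):
    """Return the primal, dual, and complementarity residuals."""
    primal = np.linalg.norm(b - A @ x)       # Ax = b
    dual = np.linalg.norm(c - A.T @ y - s)   # A'y + s = c
    comp = np.abs(x * s).max()               # XSe = 0
    return primal, dual, comp

# tiny LP: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; optimum is x = (1, 0)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x_opt = np.array([1.0, 0.0])
y_opt = np.array([1.0])        # dual price of the single constraint
s_opt = c - A.T @ y_opt        # reduced costs (0, 1), both >= 0

print(*kkt_residuals(A, b, c, x_opt, y_opt, s_opt))   # 0.0 0.0 0.0
```

At this point the duality gap cᵀx − bᵀy is also zero: both objectives equal 1.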
The interior point algorithm works by using Newton's method to find a direction
(Δx^k, Δy^k, Δs^k) in which to move from the current solution (x^k, y^k, s^k) toward a better solution:

   (x^(k+1), y^(k+1), s^(k+1)) = (x^k, y^k, s^k) + α(Δx^k, Δy^k, Δs^k)

where α is the step length and is assigned a value as large as possible but not so large that an x_i^(k+1)
or s_i^(k+1) is too close to zero. The direction in which to move is found using

   A Δx^k = infeas_c
   Aᵀ Δy^k + Δs^k = infeas_d
   S^k Δx^k + X^k Δs^k = −X^k S^k e
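The step-length rule described above is typically implemented as a ratio test. The following Python sketch shows one common variant; the damping factor eta = 0.99 is an assumed, typical value, not one prescribed by PROC INTPOINT.

```python
# Ratio test: the largest alpha in (0, 1] keeping the next iterate positive.
import numpy as np

def step_length(x, dx, s, ds, eta=0.99):
    """Largest alpha so that x + alpha*dx and s + alpha*ds stay strictly positive."""
    alpha = 1.0
    for v, dv in ((x, dx), (s, ds)):
        neg = dv < 0                  # only shrinking components can bind
        if np.any(neg):
            alpha = min(alpha, eta * np.min(-v[neg] / dv[neg]))
    return alpha

x = np.array([1.0, 2.0])
dx = np.array([-2.0, 1.0])     # first component binds: ratio 1/2
s = np.array([0.5, 1.0])
ds = np.array([0.25, -0.25])   # second component binds: ratio 4
print(step_length(x, dx, s, ds))   # 0.495  (= 0.99 * 0.5)
```

Damping by eta keeps the iterate strictly in the interior, which the algorithm requires.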
To greatly improve performance, the third equation is changed to

   S^k Δx^k + X^k Δs^k = −X^k S^k e + σ^k μ^k e

where μ^k = (x^k)ᵀ s^k / n, the average complementarity, and 0 ≤ σ^k ≤ 1.
The effect now is to find a direction in which to move to reduce infeasibilities and to reduce the
complementarity toward zero, but if any x_i^k s_i^k is too close to zero, it is "nudged out" to μ, and any
x_i^k s_i^k that is larger than μ is "nudged into" μ. A σ^k close to or equal to 0.0 biases a direction toward
the optimum, and a value of σ^k close to or equal to 1.0 "centers" the direction toward a point where
all pairwise products x_i^k s_i^k = μ. Such points make up the central path in the interior. Although
centering directions make little, if any, progress in reducing μ and moving the solution closer to the
optimum, substantial progress toward the optimum can usually be made in the next iteration.

The central path is crucial to why the interior point algorithm is so efficient. As μ is decreased,
this path guides the algorithm to the optimum through the interior of feasible space. Without
centering, the algorithm would find a series of solutions near each other close to the boundary of
feasible space. Step lengths along the direction would be small and many more iterations would
probably be required to reach the optimum.
That, in a nutshell, is the primal-dual interior point algorithm. Varieties of the algorithm differ in the
way α and σ^k are chosen and the direction adjusted during each iteration. A wealth of information
can be found in the texts by Roos, Terlaky, and Vial (1997), Wright (1996), and Ye (1996).
The calculation of the direction is the most time-consuming step of the interior point algorithm.
Assume the kth iteration is being performed, so the subscript and superscript k can be dropped from
the algebra:
^. = infeas
c
T
^, ^s = infeas
d
S^. X^s = XSe oje
Rearranging the second equation,

   Δs = infeas_d - A^T Δy

Rearranging the third equation,

   Δs = X^{-1}(-S Δx - XSe + σμe)
   Δs = -Θ Δx - Se + X^{-1}σμe

where Θ = SX^{-1}.
Equating these two expressions for Δs and rearranging,

   -Θ Δx - Se + X^{-1}σμe = infeas_d - A^T Δy
   -Θ Δx = Se - X^{-1}σμe + infeas_d - A^T Δy
   Δx = Θ^{-1}(-Se + X^{-1}σμe - infeas_d + A^T Δy)
   Δx = ρ + Θ^{-1} A^T Δy

where ρ = Θ^{-1}(-Se + X^{-1}σμe - infeas_d).
Substituting into the first direction equation,

   A Δx = A(ρ + Θ^{-1} A^T Δy) = infeas_c
   A Θ^{-1} A^T Δy = infeas_c - Aρ
   Δy = (A Θ^{-1} A^T)^{-1}(infeas_c - Aρ)

Θ, ρ, Δy, Δx, and Δs are calculated in that order. The hardest term is the factorization of the (A Θ^{-1} A^T) matrix to determine Δy. Fortunately, although the values of (A Θ^{-1} A^T) are different for each iteration, the locations of the nonzeros in this matrix remain fixed; the nonzero locations are the same as those in the matrix (A A^T). This is because Θ^{-1} = XS^{-1} is a diagonal matrix that has the effect of merely scaling the columns of A.
The fact that the nonzeros in A Θ^{-1} A^T have a constant pattern is exploited by all interior point algorithms and is a major reason for their excellent performance. Before iterations begin, A A^T is examined and its rows and columns are symmetrically permuted so that, during Cholesky factorization, the number of fill-ins created is smaller. A list of arithmetic operations to perform the factorization is saved in concise computer data structures (working with memory locations rather than actual numerical values). This is called symbolic factorization. During iterations, when memory has been initialized with numerical values, the operations list is performed sequentially. Determining how the factorization should be performed again and again is unnecessary.
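The sparsity claim can be checked with a small sketch (illustrative only, not PROC INTPOINT code). For a matrix A with nonnegative entries, scaling its columns by any positive diagonal Θ^{-1} leaves the nonzero pattern of A Θ^{-1} A^T identical to that of A A^T:

```python
import numpy as np

# A small sparse-looking matrix with nonnegative entries (made up).
A = np.array([[1.0, 0.0, 2.0, 0.0],
              [0.0, 3.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 4.0]])
pattern = (A @ A.T) != 0            # nonzero pattern of A A'

rng = np.random.default_rng(1)
for _ in range(3):
    d = rng.uniform(0.1, 10.0, size=A.shape[1])   # diagonal of Theta^{-1}
    M = (A * d) @ A.T                             # A Theta^{-1} A'
    assert np.array_equal(M != 0, pattern)        # same sparsity every time
```

This is why the symbolic factorization computed once before the iterations begin remains valid at every iteration.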
The Primal-Dual Predictor-Corrector Interior Point Algorithm
The variant of the interior point algorithm implemented in PROC INTPOINT is a Primal-Dual Predictor-Corrector interior point algorithm. At first, Newton's method is used to find a direction (Δx^k_aff, Δy^k_aff, Δs^k_aff) to move, but calculated as if μ is zero, that is, as a step with no centering, known as an affine step:

   A Δx^k_aff = infeas_c
   A^T Δy^k_aff + Δs^k_aff = infeas_d
   S^k Δx^k_aff + X^k Δs^k_aff = -X^k S^k e

   (x^k_aff, y^k_aff, s^k_aff) = (x^k, y^k, s^k) + α(Δx^k_aff, Δy^k_aff, Δs^k_aff)

where α is the step length as before.
Complementarity, x^T s, is calculated at (x^k_aff, y^k_aff, s^k_aff) and compared with the complementarity at the starting point (x^k, y^k, s^k), and the success of the affine step is gauged. If the affine step was successful in reducing the complementarity by a substantial amount, the need for centering is not great, and σ^k in the following linear system is assigned a value close to zero. If, however, the affine step was unsuccessful, centering would be beneficial, and σ^k in the following linear system is assigned a value closer to 1.0. The value of σ^k is therefore adaptively altered depending on the progress made toward the optimum.
A second linear system is solved to determine a centering vector (Δx^k_c, Δy^k_c, Δs^k_c) from (x^k_aff, y^k_aff, s^k_aff):

   A Δx^k_c = 0
   A^T Δy^k_c + Δs^k_c = 0
   S^k Δx^k_c + X^k Δs^k_c = -X^k_aff S^k_aff e + σ^k μ^k e
Then

   (Δx^k, Δy^k, Δs^k) = (Δx^k_aff, Δy^k_aff, Δs^k_aff) + (Δx^k_c, Δy^k_c, Δs^k_c)
   (x^(k+1), y^(k+1), s^(k+1)) = (x^k, y^k, s^k) + α(Δx^k, Δy^k, Δs^k)

where, as before, α is the step length assigned a value as large as possible, but not so large that an x_i^(k+1) or s_i^(k+1) is too close to zero.
Although the Predictor-Corrector variant entails solving two linear systems instead of one, fewer iterations are usually required to reach the optimum. The additional overhead of calculating the second linear system is small, as the factorization of the (A Θ^{-1} A^T) matrix has already been performed to solve the first linear system.
Interior Point: Upper Bounds
If the LP had upper bounds (0 ≤ x ≤ u, where u is the upper bound vector), then the primal and dual problems, the duality gap, and the KKT conditions would have to be expanded.
The primal linear program to be solved is

   minimize    c^T x
   subject to  Ax = b
               0 ≤ x ≤ u

where 0 ≤ x ≤ u is split into x ≥ 0 and x ≤ u. Let z be primal slack so that x + z = u, and associate dual variables w with these constraints. The interior point algorithm solves the system of equations to satisfy the Karush-Kuhn-Tucker (KKT) conditions for optimality:

   Ax = b
   x + z = u
   A^T y + s - w = c
   XSe = 0
   ZWe = 0
   x, s, z, w ≥ 0
These are the conditions for feasibility, with the complementarity conditions XSe = 0 and ZWe = 0 added. Complementarity forces the optimal objectives of the primal and dual to be equal, c^T x_opt = b^T y_opt - u^T w_opt, as

   0 = z_opt^T w_opt = (u - x_opt)^T w_opt = u^T w_opt - x_opt^T w_opt

   0 = x_opt^T s_opt = s_opt^T x_opt = (c - A^T y_opt + w_opt)^T x_opt
     = c^T x_opt - y_opt^T (A x_opt) + w_opt^T x_opt
     = c^T x_opt - b^T y_opt + u^T w_opt
Before the optimum is reached, a solution (x, y, s, z, w) might not satisfy the KKT conditions:

   • Primal bound constraints may be violated, infeas_b = u - x - z ≠ 0.
   • Primal constraints may be violated, infeas_c = b - Ax ≠ 0.
   • Dual constraints may be violated, infeas_d = c - A^T y - s + w ≠ 0.
   • Complementarity conditions may not be satisfied, x^T s ≠ 0 or z^T w ≠ 0.
The calculations of the interior point algorithm can easily be derived in a fashion similar to the calculations for when an LP has no upper bounds. See the paper by Lustig, Marsten, and Shanno (1992).
In some iteration k, the affine step system that must be solved is

   Δx_aff + Δz_aff = infeas_b
   A Δx_aff = infeas_c
   A^T Δy_aff + Δs_aff - Δw_aff = infeas_d
   S Δx_aff + X Δs_aff = -XSe
   Z Δw_aff + W Δz_aff = -ZWe
Therefore, the computations involved in solving the affine step are

   Θ = SX^{-1} + WZ^{-1}
   ρ = Θ^{-1}(infeas_d + (S - W)e - Z^{-1}W infeas_b)
   Δy_aff = (A Θ^{-1} A^T)^{-1}(infeas_c + Aρ)
   Δx_aff = Θ^{-1} A^T Δy_aff - ρ
   Δz_aff = infeas_b - Δx_aff
   Δw_aff = -We - Z^{-1}W Δz_aff
   Δs_aff = -Se - X^{-1}S Δx_aff

   (x_aff, y_aff, s_aff, z_aff, w_aff) = (x, y, s, z, w) + α(Δx_aff, Δy_aff, Δs_aff, Δz_aff, Δw_aff)

and α is the step length as before.
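These closed-form computations can be verified numerically. The following sketch (made-up illustrative data, not PROC INTPOINT code) builds the affine direction for the upper-bounded case and checks that it solves all five affine step equations:

```python
import numpy as np

# Illustrative interior point with x, s, z, w strictly positive.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
x = np.array([1.0, 2.0, 3.0]);  z = np.array([2.0, 1.0, 0.5])
s = np.array([1.0, 0.5, 0.2]);  w = np.array([0.3, 0.4, 0.6])
y = np.array([0.5, 0.1])
b = np.array([6.0, 5.0]);  u = np.array([3.5, 2.5, 4.0])
c = np.array([1.0, 2.0, 0.0])

infeas_b = u - x - z
infeas_c = b - A @ x
infeas_d = c - A.T @ y - s + w

theta = s / x + w / z                       # diagonal of Theta
rho = (infeas_d + (s - w) - (w / z) * infeas_b) / theta
M = (A / theta) @ A.T                       # A Theta^{-1} A'
dy = np.linalg.solve(M, infeas_c + A @ rho)
dx = (A.T @ dy) / theta - rho
dz = infeas_b - dx
dw = -w - (w / z) * dz
ds = -s - (s / x) * dx

# All five affine step equations hold:
assert np.allclose(dx + dz, infeas_b)
assert np.allclose(A @ dx, infeas_c)
assert np.allclose(A.T @ dy + ds - dw, infeas_d)
assert np.allclose(s * dx + x * ds, -x * s)
assert np.allclose(z * dw + w * dz, -z * w)
```

As in the unbounded case, only one factorization (here, the dense solve standing in for the sparse Cholesky factorization) is needed per iteration.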
A second linear system is solved to determine a centering vector (Δx_c, Δy_c, Δs_c, Δz_c, Δw_c) from (x_aff, y_aff, s_aff, z_aff, w_aff):

   Δx_c + Δz_c = 0
   A Δx_c = 0
   A^T Δy_c + Δs_c - Δw_c = 0
   S Δx_c + X Δs_c = -X_aff S_aff e + σμe
   Z Δw_c + W Δz_c = -Z_aff W_aff e + σμe
where

   comp_start = x^T s + z^T w, the complementarity at the start of the iteration
   comp_aff = x_aff^T s_aff + z_aff^T w_aff, the affine complementarity
   μ = comp_aff / 2n, the average complementarity
   σ = (comp_aff / comp_start)^3
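The adaptive choice of σ can be illustrated with a tiny helper (hypothetical name, not part of PROC INTPOINT; the cube rule follows the formula just given):

```python
def centering_weight(comp_start, comp_aff):
    """sigma = (comp_aff / comp_start)**3: near 0 after a successful
    affine step, near 1 when the affine step reduced complementarity little."""
    return (comp_aff / comp_start) ** 3

assert centering_weight(100.0, 50.0) == 0.125   # good reduction -> little centering
assert centering_weight(100.0, 90.0) > 0.7      # poor reduction -> more centering
```

A large reduction in complementarity thus yields a nearly pure affine step, while a poor affine step is followed by a strongly centered step.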
Therefore, the computations involved in solving the centering step are

   ρ = Θ^{-1}(X^{-1} X_aff S_aff e - Z^{-1} Z_aff W_aff e - σμ(X^{-1} - Z^{-1})e)
   Δy_c = (A Θ^{-1} A^T)^{-1} Aρ
   Δx_c = Θ^{-1} A^T Δy_c - ρ
   Δz_c = -Δx_c
   Δw_c = σμ Z^{-1} e - Z^{-1} Z_aff W_aff e - Z^{-1} W Δz_c
   Δs_c = σμ X^{-1} e - X^{-1} X_aff S_aff e - X^{-1} S Δx_c
Then

   (Δx, Δy, Δs, Δz, Δw) = (Δx_aff, Δy_aff, Δs_aff, Δz_aff, Δw_aff) + (Δx_c, Δy_c, Δs_c, Δz_c, Δw_c)

   (x^(k+1), y^(k+1), s^(k+1), z^(k+1), w^(k+1)) = (x^k, y^k, s^k, z^k, w^k) + α(Δx, Δy, Δs, Δz, Δw)

where, as before, α is the step length assigned a value as large as possible, but not so large that an x_i^(k+1), s_i^(k+1), z_i^(k+1), or w_i^(k+1) is too close to zero.
The algebra in this section has been simplified by assuming that all variables have finite upper bounds. If the number of variables with finite upper bounds n_u < n, you need to change the algebra to reflect that the z and w vectors and the Z and W matrices have dimension n_u × 1 and n_u × n_u, respectively. Other computations need slight modification. For example, the average complementarity is

   μ = x_aff^T s_aff / n + z_aff^T w_aff / n_u

An important point is that any upper bounds can be handled by specializing the algorithm, and not by generating the constraints x ≤ u and adding these to the main primal constraints Ax = b.
Network Models
The following are descriptions of some typical NPSC models.
Production, Inventory, and Distribution (Supply Chain) Problems
One common class of network models is the production-inventory-distribution or supply-chain
problem. The diagram in Figure 4.1 illustrates this problem. The subscripts on the Production,
Inventory, and Sales nodes indicate the time period. By replicating sections of the model, the notion
of time can be included.
Figure 4.1 Production-Inventory-Distribution Problem
[Figure 4.1: Production, Inventory, and Sales nodes for time periods i-1, i, and i+1, with inventory arcs linking successive periods; "Stock on hand" enters the earliest Inventory node and "Stock at end" leaves the latest.]
In this type of model, the nodes can represent a wide variety of facilities. Several examples are
suppliers, spot markets, importers, farmers, manufacturers, factories, parts of a plant, production
lines, waste disposal facilities, workstations, warehouses, coolstores, depots, wholesalers, export
markets, ports, rail junctions, airports, road intersections, cities, regions, shops, customers, and
consumers. The diversity of this selection demonstrates how rich the potential applications of this
model are.
Depending upon the interpretation of the nodes, the objectives of the modeling exercise can vary widely. Some common types of objectives are

   • to reduce collection or purchase costs of raw materials
   • to reduce inventory holding or backorder costs. Warehouses and other storage facilities sometimes have capacities, and there can be limits on the amount of goods that can be placed on backorder.
   • to decide where facilities should be located and what the capacity of these should be. Network models have been used to help decide where factories, hospitals, ambulance and fire stations, oil and water wells, and schools should be sited.
   • to determine the assignment of resources (machines, production capability, workforce) to tasks, schedules, classes, or files
   • to determine the optimal distribution of goods or services. This usually means minimizing transportation costs and reducing transit time or distances covered.
   • to find the shortest path from one location to another
   • to ensure that demands (for example, production requirements, market demands, contractual obligations) are met
   • to maximize profits from the sale of products or the charge for services
   • to maximize production by identifying bottlenecks
Some specic applications are
   • car distribution models. These help determine which models and numbers of cars should be manufactured in which factories and where to distribute cars from these factories to zones in the United States in order to meet customer demand at least cost.
   • models in the timber industry. These help determine when to plant and mill forests, schedule production of pulp, paper, and wood products, and distribute products for sale or export.
   • military applications. The nodes can be theaters, bases, ammunition dumps, logistical suppliers, or radar installations. Some models are used to find the best ways to mobilize personnel and supplies and to evacuate the wounded in the least amount of time.
   • communications applications. The nodes can be telephone exchanges, transmission lines, satellite links, and consumers. In a model of an electrical grid, the nodes can be transformers, power stations, watersheds, reservoirs, dams, and consumers. The effect of high loads or outages might be of concern.
Proportionality Constraints
In many models, you have the characteristic that a flow through an arc must be proportional to the flow through another arc. Side constraints are often necessary to model that situation. Such constraints are called proportionality constraints and are useful in models where production is subject to refining or modification into different materials. The amount of each output, or any waste, evaporation, or reduction, can be specified as a proportion of input.
Typically, the arcs near the supply nodes carry raw materials and the arcs near the demand nodes carry refined products. For example, in a model of the milling industry, the flow through some arcs may represent quantities of wheat. After the wheat is processed, the flow through other arcs might be flour. For others it might be bran. The side constraints model the relationship between the amount of flour or bran produced as a proportion of the amount of wheat milled. Some of the wheat can end up as neither flour, bran, nor any useful product, so this waste is drained away via arcs to a waste node.
Figure 4.2 Proportionality Constraints
[Figure 4.2: 100 units of flow on the arc from Wheat to Mill split into 30 units to Flour, 20 units to Bran, and 50 units to Other.]
In order for arcs to be specified in side constraints, they must be named. By default, PROC INTPOINT names arcs using the names of the nodes at the head and tail of the arc. An arc is named with its tail node name followed by an underscore and its head node name. For example, an arc from node from to node to is called from_to.
Consider the network fragment in Figure 4.2. The arc Wheat_Mill conveys the wheat milled. The cost of flow on this arc is the milling cost. The capacity of this arc is the capacity of the mill. The lower flow bound on this arc is the minimum quantity that must be milled for the mill to operate economically. The constraints

   0.3 Wheat_Mill - Mill_Flour = 0.0
   0.2 Wheat_Mill - Mill_Bran = 0.0

force every unit of wheat that is milled to produce 0.3 units of flour and 0.2 units of bran. Note that it is not necessary to specify the constraint

   0.5 Wheat_Mill - Mill_Other = 0.0

since flow conservation implies that any flow that does not traverse through Mill_Flour or Mill_Bran must be conveyed through Mill_Other. And, computationally, it is better if this constraint is not specified, since there is one less side constraint and fewer problems with numerical precision. Notice that the sum of the proportions must equal 1.0 exactly; otherwise, flow conservation is violated.
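The milling proportions above can be checked with a few lines of plain Python (illustrative only; variable names mirror the arc names):

```python
wheat_mill = 100.0                    # units of wheat milled
mill_flour = 0.3 * wheat_mill         # from 0.3 Wheat_Mill - Mill_Flour = 0.0
mill_bran  = 0.2 * wheat_mill         # from 0.2 Wheat_Mill - Mill_Bran  = 0.0
mill_other = wheat_mill - mill_flour - mill_bran   # implied by flow conservation

# The proportions 0.3 + 0.2 + 0.5 sum to 1.0, so no flow is created or lost.
assert round(mill_flour, 9) == 30.0
assert round(mill_bran, 9) == 20.0
assert round(mill_other, 9) == 50.0
```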
Blending Constraints
Blending or quality constraints can also influence the recipes or proportions of ingredients that are mixed. For example, different raw materials can have different properties. In an application of the oil industry, the amount of products that are obtained could be different for each type of crude oil. Furthermore, fuel might have a minimum octane requirement or limited sulphur or lead content, so that a blending of crudes is needed to produce the product.
The network fragment in Figure 4.3 shows an example of this.
Figure 4.3 Blending Constraints
[Figure 4.3: crude flows from the USA and MidEast nodes to Port, then through Refinery to Gasoline, Diesel, and Other; the crudes carry 4 units/liter (USA) and 5 units/liter (MidEast) of sulphur, and diesel may carry at most 4.75 units/liter.]
The arcs MidEast_Port and USA_Port convey crude oil from the two sources. The arc Port_Refinery represents refining, while the arcs Refinery_Gasoline and Refinery_Diesel carry the gas and diesel produced. The proportionality constraints

   0.4 Port_Refinery - Refinery_Gasoline = 0.0
   0.2 Port_Refinery - Refinery_Diesel = 0.0

capture the restrictions for producing gasoline and diesel from crude. Suppose that only crude from the Middle East is used; then the resulting diesel would contain 5 units of sulphur per liter. If only crude from the U.S.A. is used, the resulting diesel would contain 4 units of sulphur per liter. Diesel can have at most 4.75 units of sulphur per liter. Some crude from the U.S.A. must be used if Middle East crude is used in order to meet the 4.75 sulphur per liter limit. The side constraint to model this requirement is

   5 MidEast_Port + 4 USA_Port - 4.75 Port_Refinery ≤ 0.0

Since Port_Refinery = MidEast_Port + USA_Port, flow conservation allows this constraint to be simplified to

   1 MidEast_Port - 3 USA_Port ≤ 0.0

If, for example, 120 units of crude from the Middle East is used, then at least 40 units of crude from the U.S.A. must be used. The preceding constraint is simplified because you assume that the sulphur concentration of diesel is proportional to the sulphur concentration of the crude mix. If this is not the case, the relation

   0.2 Port_Refinery = Refinery_Diesel

is used to obtain

   5 MidEast_Port + 4 USA_Port - 4.75 (1.0/0.2) Refinery_Diesel ≤ 0.0

which equals

   5 MidEast_Port + 4 USA_Port - 23.75 Refinery_Diesel ≤ 0.0
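The arithmetic of the blending constraint can be checked directly (illustrative Python; the helper name is hypothetical):

```python
def diesel_sulphur_ok(mideast, usa):
    """Side constraint 5 MidEast_Port + 4 USA_Port - 4.75 Port_Refinery <= 0,
    with Port_Refinery = MidEast_Port + USA_Port by flow conservation."""
    return 5 * mideast + 4 * usa - 4.75 * (mideast + usa) <= 0

assert not diesel_sulphur_ok(120, 39)   # 39 units of U.S. crude: limit violated
assert diesel_sulphur_ok(120, 40)       # at least 40 units are required
```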
An example similar to this oil industry problem is solved in the section "Introductory NPSC Example" on page 64.
Multicommodity Problems
Side constraints are also used in models in which there are capacities on transportation or some
other shared resource, or there are limits on overall production or demand in multicommodity,
multidivisional, or multiperiod problems. Each commodity, division, or period can have a separate
network coupled to one main system by the side constraints. Side constraints are used to combine
the outputs of subdivisions of a problem (either commodities, outputs in distinct time periods, or
different process streams) to meet overall demands or to limit overall production or expenditures.
This method is more desirable than doing separate local optimizations for individual commodity, process, or time networks and then trying to establish relationships between each when determining an overall policy if the global constraint is not satisfied. Of course, to make models more realistic, side constraints may be necessary in the local problems.
Figure 4.4 Multicommodity Problem
[Figure 4.4: two identical network fragments, one per commodity, with factory nodes Factorycom1 and Factorycom2 and city nodes City1com1, City2com1, City1com2, and City2com2.]
Figure 4.4 shows two network fragments. They represent identical production and distribution sites of two different commodities. Suffix com1 represents commodity 1 and suffix com2 represents commodity 2. The nodes Factorycom1 and Factorycom2 model the same factory, and nodes City1com1 and City1com2 model the same location, city 1. Similarly, City2com1 and City2com2 are the same location, city 2. Suppose that commodity 1 occupies 2 cubic meters, commodity 2 occupies 3 cubic meters, the truck dispatched to city 1 has a capacity of 200 cubic meters, and the truck dispatched to city 2 has a capacity of 250 cubic meters. How much of each commodity can be loaded onto each truck? The side constraints for this case are

   2 Factorycom1_City1com1 + 3 Factorycom2_City1com2 ≤ 200
   2 Factorycom1_City2com1 + 3 Factorycom2_City2com2 ≤ 250
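These truck-capacity side constraints are easy to sanity-check numerically (illustrative Python; the helper name is hypothetical):

```python
def truck_ok(com1_units, com2_units, capacity):
    """2 cubic meters per unit of commodity 1, 3 per unit of commodity 2."""
    return 2 * com1_units + 3 * com2_units <= capacity

assert truck_ok(40, 40, 200)        # 80 + 120 = 200 cubic meters: fits exactly
assert not truck_ok(50, 40, 200)    # 100 + 120 = 220: exceeds the truck to city 1
assert truck_ok(50, 50, 250)        # 100 + 150 = 250: fits the truck to city 2
```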
Large Modeling Strategy
In many cases, the flow through an arc might actually represent the flow or movement of a commodity from place to place or from time period to time period. However, sometimes an arc is included in the network as a method of capturing some aspect of the problem that you would not normally think of as part of a network model. There is no commodity movement associated with that arc. For example, in a multiprocess, multiproduct model (Figure 4.5), there might be subnetworks for each process and each product. The subnetworks can be joined together by a set of arcs that have flows that represent the amount of product produced by process i. To model an upper-limit constraint on the total amount of product that can be produced, direct all arcs carrying product to a single node and from there through a single arc. The capacity of this arc is the upper limit of product production. It is preferable to model this structure in the network rather than to include it in the side constraints, because the efficiency of the optimizer is less affected by a reasonable increase in the size of the network than by an increase in the number or complexity of the side constraints.
Figure 4.5 Multiprocess, Multiproduct Example
[Figure 4.5: Process 100 and Process 200 subnetworks each feed through a single arc whose capacity is that process's capacity, and Product 100 and Product 200 subnetworks each drain through a single arc whose capacity is the upper limit of that product's production.]
When starting a project, it is often a good strategy to use a small network formulation and then use
that model as a framework upon which to add detail. For example, in the multiprocess, multiproduct
model, you might start with the network depicted in Figure 4.5. Then, for example, the process
subnetwork can be enhanced to include the distribution of products. Other phases of the operation
could be included by adding more subnetworks. Initially, these subnetworks can be single nodes, but
in subsequent studies they can be expanded to include greater detail.
Advantages of Network Models over LP Models
Many linear programming problems have large embedded network structures. Such problems often
result when modeling manufacturing processes, transportation or distribution networks, or resource
allocation, or when deciding where to locate facilities. Often, some commodity is to be moved from
place to place, so the more natural formulation in many applications is that of a constrained network
rather than a linear program.
Using a network diagram to visualize a problem makes it possible to capture the important relationships in an easily understood picture form. The network diagram aids the communication between model builder and model user, making it easier to comprehend how the model is structured, how it can be changed, and how results can be interpreted.
If a network structure is embedded in a linear program, the problem is an NPSC (see the section "Mathematical Description of NPSC" on page 45). When the network part of the problem is large compared to the nonnetwork part, especially if the number of side constraints is small, it is worthwhile to exploit this structure to describe the model. Rather than generating the data for the flow conservation constraints, generate instead the data for the nodes and arcs of the network.
Flow Conservation Constraints
The constraints Fx = b in NPSC (see the section "Mathematical Description of NPSC" on page 45) are referred to as the nodal flow conservation constraints. These constraints algebraically state that the sum of the flow through arcs directed toward a node, plus that node's supply, if any, equals the sum of the flow through arcs directed away from that node, plus that node's demand, if any. The flow conservation constraints are implicit in the network model and should not be specified explicitly in side constraint data when using PROC INTPOINT to solve NPSC problems.
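The nodal balance just described can be sketched as a one-line check (plain Python; the names are illustrative only):

```python
def node_balanced(inflow, supply, outflow, demand):
    """Flow into a node plus its supply equals flow out plus its demand."""
    return inflow + supply == outflow + demand

assert node_balanced(inflow=0, supply=100, outflow=100, demand=0)    # pure supply node
assert node_balanced(inflow=95, supply=0, outflow=0, demand=95)      # pure demand node
assert not node_balanced(inflow=50, supply=0, outflow=60, demand=0)  # flow is lost
```

PROC INTPOINT generates one such balance equation per node automatically from the arc definitions.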
Nonarc Variables
Nonarc variables can be used to simplify side constraints. For example, if a sum of flows appears in many constraints, it may be worthwhile to equate this expression with a nonarc variable and use this variable in the other constraints. This keeps the constraint coefficient matrix sparse. By assigning a nonarc variable a nonzero objective function coefficient, it is then possible to incur a cost for using resources above some lowest feasible limit. Similarly, a profit (a negative objective function coefficient value) can be made if all available resources are not used.
In some models, nonarc variables are used in constraints to absorb excess resources or supply needed resources. Then, either the excess resource can be used or the needed resource can be supplied to another component of the model.
For example, consider a multicommodity problem of making television sets that have either 19- or 25-inch screens. In their manufacture, three and four chips, respectively, are used. Production occurs at two factories during March and April. The supplier of chips can supply only 2,600 chips to factory 1 and 3,750 chips to factory 2 each month. The names of arcs are in the form Prodn_s_m, where n is the factory number, s is the screen size, and m is the month. For example, Prod1_25_Apr is the arc that conveys the number of 25-inch TVs produced in factory 1 during April. You might have to determine similar systematic naming schemes for your application.
As described, the constraints are

   3 Prod1_19_Mar + 4 Prod1_25_Mar ≤ 2600
   3 Prod2_19_Mar + 4 Prod2_25_Mar ≤ 3750
   3 Prod1_19_Apr + 4 Prod1_25_Apr ≤ 2600
   3 Prod2_19_Apr + 4 Prod2_25_Apr ≤ 3750
If there are chips that could be obtained for use in March but not used for production in March, why not keep these unused chips until April? Furthermore, if the March excess chips at factory 1 could be used either at factory 1 or factory 2 in April, the model becomes

   3 Prod1_19_Mar + 4 Prod1_25_Mar + F1_Unused_Mar = 2600
   3 Prod2_19_Mar + 4 Prod2_25_Mar + F2_Unused_Mar = 3750
   3 Prod1_19_Apr + 4 Prod1_25_Apr - F1_Kept_Since_Mar = 2600
   3 Prod2_19_Apr + 4 Prod2_25_Apr - F2_Kept_Since_Mar = 3750
   F1_Unused_Mar + F2_Unused_Mar - F1_Kept_Since_Mar - F2_Kept_Since_Mar ≥ 0.0

where F1_Kept_Since_Mar is the number of chips used during April at factory 1 that were obtained in March at either factory 1 or factory 2, and F2_Kept_Since_Mar is the number of chips used during April at factory 2 that were obtained in March. The last constraint ensures that the number of chips used during April that were obtained in March does not exceed the number of chips not used in March. There may be a cost to hold chips in inventory. This can be modeled by having a positive objective function coefficient for the nonarc variables F1_Kept_Since_Mar and F2_Kept_Since_Mar. Moreover, nonarc variable upper bounds represent an upper limit on the number of chips that can be held in inventory between March and April.
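To make the carryover constraints concrete, the following Python sketch checks one hypothetical production plan against the March balances and the chip-carryover limit (all numbers except the chip supplies are invented):

```python
def chips_needed(tv19, tv25):
    """Three chips per 19-inch set, four per 25-inch set."""
    return 3 * tv19 + 4 * tv25

f1_unused_mar = 2600 - chips_needed(400, 300)   # = 200 chips left at factory 1
f2_unused_mar = 3750 - chips_needed(450, 600)   # = 0 chips left at factory 2
f1_kept_since_mar = 150                          # chips carried into April
f2_kept_since_mar = 50

assert f1_unused_mar >= 0 and f2_unused_mar >= 0
# Chips kept for April cannot exceed the chips not used in March:
assert (f1_unused_mar + f2_unused_mar
        - f1_kept_since_mar - f2_kept_since_mar) >= 0
```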
See Example 4.1 through Example 4.5, which use this TV problem. The use of nonarc variables as
described previously is illustrated.
Introduction
Getting Started: NPSC Problems
To solve NPSC problems using PROC INTPOINT, you save a representation of the network and the side constraints in three SAS data sets. These data sets are then passed to PROC INTPOINT for solution. There are various forms that a problem's data can take. You can use any one or a combination of several of these forms.
The NODEDATA= data set contains the names of the supply and demand nodes and the supply or
demand associated with each. These are the elements in the column vector b in the NPSC problem
(see the section Mathematical Description of NPSC on page 45).
The ARCDATA= data set contains information about the variables of the problem. Usually these are
arcs, but there can also be data related to nonarc variables in the ARCDATA= data set.
An arc is identified by the names of its tail node (where it originates) and head node (where it is directed). Each observation can be used to identify an arc in the network and, optionally, the cost per flow unit across the arc, the arc's capacity, lower flow bound, and name. These data are associated with the matrix F and the vectors c, l, and u in the NPSC problem (see the section "Mathematical Description of NPSC" on page 45).
NOTE: Although F is a node-arc incidence matrix, it is specified in the ARCDATA= data set by arc definitions. Do not explicitly specify these flow conservation constraints as constraints of the problem.
In addition, the ARCDATA= data set can be used to specify information about nonarc variables, including objective function coefficients, lower and upper value bounds, and names. These data are the elements of the vectors d, m, and v in the NPSC problem (see the section "Mathematical Description of NPSC" on page 45). Data for an arc or nonarc variable can be given in more than one observation.
Supply and demand data also can be specied in the ARCDATA= data set. In such a case, the
NODEDATA= data set may not be needed.
The CONDATA= data set describes the side constraints and their right-hand sides. These data are elements of the matrices H and Q and the vector r. Constraint types are also specified in the CONDATA= data set. You can include in this data set upper bound values or capacities, lower flow or value bounds, and costs or objective function coefficients. It is possible to give all information about some or all nonarc variables in the CONDATA= data set.
An arc is identified in this data set by its name. If you specify an arc's name in the ARCDATA= data set, then this name is used to associate data in the CONDATA= data set with that arc. Each arc also has a default name that is the name of the tail node and head node of the arc concatenated together and separated by an underscore character; tail_head, for example.
If you use the dense side constraint input format (described in the section CONDATA= Data Set
on page 110), and want to use the default arc names, these arc names are names of SAS variables in
the VAR list of the CONDATA= data set.
If you use the sparse side constraint input format (see the section CONDATA= Data Set on
page 110) and want to use the default arc names, these arc names are values of the COLUMN list
variable of the CONDATA= data set.
PROC INTPOINT reads the data from the NODEDATA= data set, the ARCDATA= data set, and the CONDATA= data set. Error checking is performed, and the model is converted into an equivalent LP. This LP is preprocessed. Preprocessing is optional but highly recommended. Preprocessing analyzes the model and tries to determine before optimization whether variables can be fixed to their optimal values. Knowing that, the model can be modified and these variables dropped out. It can be determined that some constraints are redundant. Sometimes, preprocessing succeeds in reducing the size of the problem, thereby making the subsequent optimization easier and faster.
The optimal solution to the equivalent LP is then found. This LP is converted back to the original NPSC problem, and the optimum for this is derived from the optimum of the equivalent LP. If the problem was preprocessed, the model is now postprocessed, where fixed variables are reintroduced. The solution can be saved in the CONOUT= data set.
Introductory NPSC Example
Consider the following transshipment problem for an oil company. Crude oil is shipped to refineries where it is processed into gasoline and diesel fuel. The gasoline and diesel fuel are then distributed to service stations. At each stage, there are shipping, processing, and distribution costs. Also, there are lower flow bounds and capacities.
In addition, there are two sets of side constraints. The first set is that two times the crude from the Middle East cannot exceed the throughput of a refinery plus 15 units. (The phrase "plus 15 units" that finishes the last sentence is used to enable some side constraints in this example to have a nonzero rhs.) The second set of constraints are necessary to model the situation that one unit of crude mix processed at a refinery yields three-fourths of a unit of gasoline and one-fourth of a unit of diesel fuel.
Because there are two products that are not independent in the way in which they flow through the network, an NPSC is an appropriate model for this example (see Figure 4.6). The side constraints are used to model the limitations on the amount of Middle Eastern crude that can be processed by each refinery and the conversion proportions of crude to gasoline and diesel fuel.
Figure 4.6 Oil Industry Example
[Figure 4.6: network with supply nodes middle east and u.s.a., refinery nodes refinery1 and refinery2, intermediate nodes r1 and r2, product nodes ref1 gas, ref1 diesel, ref2 gas, and ref2 diesel, and demand nodes servstn1 gas, servstn1 diesel, servstn2 gas, and servstn2 diesel.]
To solve this problem with PROC INTPOINT, save a representation of the model in three SAS data sets. In the NODEDATA= data set, you name the supply and demand nodes and give the associated supplies and demands. To distinguish demand nodes from supply nodes, specify demands as negative quantities. For the oil example, the NODEDATA= data set can be saved as follows:
title 'Oil Industry Example';
title3 'Setting Up Nodedata = Noded For PROC INTPOINT';
data noded;
input _node_&$15. _sd_;
datalines;
middle east 100
u.s.a. 80
servstn1 gas -95
servstn1 diesel -30
servstn2 gas -40
servstn2 diesel -15
;
The ARCDATA= data set contains the rest of the information about the network. Each observation in the data set identifies an arc in the network and gives the cost per flow unit across the arc, the capacity of the arc, the lower bound on flow across the arc, and the name of the arc.
title3 'Setting Up Arcdata = Arcd1 For PROC INTPOINT';
data arcd1;
input _from_&$11. _to_&$15. _cost_ _capac_ _lo_ _name_ $;
datalines;
middle east refinery 1 63 95 20 m_e_ref1
middle east refinery 2 81 80 10 m_e_ref2
u.s.a. refinery 1 55 . . .
u.s.a. refinery 2 49 . . .
refinery 1 r1 200 175 50 thruput1
refinery 2 r2 220 100 35 thruput2
r1 ref1 gas . 140 . r1_gas
r1 ref1 diesel . 75 . .
r2 ref2 gas . 100 . r2_gas
r2 ref2 diesel . 75 . .
ref1 gas servstn1 gas 15 70 . .
ref1 gas servstn2 gas 22 60 . .
ref1 diesel servstn1 diesel 18 . . .
ref1 diesel servstn2 diesel 17 . . .
ref2 gas servstn1 gas 17 35 5 .
ref2 gas servstn2 gas 31 . . .
ref2 diesel servstn1 diesel 36 . . .
ref2 diesel servstn2 diesel 23 . . .
;
Finally, the CONDATA= data set contains the side constraints for the model:
title3 'Setting Up Condata = Cond1 For PROC INTPOINT';
data cond1;
input m_e_ref1 m_e_ref2 thruput1 r1_gas thruput2 r2_gas
_type_ $ _rhs_;
datalines;
-2 . 1 . . . >= -15
. -2 . . 1 . GE -15
. . -3 4 . . EQ 0
. . . . -3 4 = 0
;
Note that the SAS variable names in the CONDATA= data set are the names of arcs given in the
ARCDATA= data set. These are the arcs that have nonzero constraint coefficients in side constraints.
For example, the proportionality constraint that specifies that one unit of crude at each refinery yields
three-fourths of a unit of gasoline and one-fourth of a unit of diesel fuel is given for refinery 1 in
the third observation and for refinery 2 in the last observation. The third observation requires that
the flow on the arc r1_gas equal three-fourths of the flow on the arc thruput1. Because
all crude processed at refinery 1 flows through thruput1 and all gasoline produced at refinery 1 flows
through r1_gas, the constraint models the situation. The last observation does the same for
refinery 2.
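Read directly from the coefficients in the third observation of CONDATA=cond1, the constraint and its implication can be written as:

```latex
% Side constraint from the third observation of CONDATA=cond1
-3\,x_{\mathrm{thruput1}} + 4\,x_{\mathrm{r1\_gas}} = 0
\quad\Longrightarrow\quad
x_{\mathrm{r1\_gas}} = \tfrac{3}{4}\,x_{\mathrm{thruput1}}
```

so three-fourths of each unit of crude processed at refinery 1 leaves as gasoline, and the remaining one-fourth leaves as diesel fuel.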
To find the minimum cost flow through the network that satisfies the supplies, demands, and side
constraints, invoke PROC INTPOINT as follows:
proc intpoint
   bytes=1000000
   nodedata=noded    /* the supply and demand data */
   arcdata=arcd1     /* the arc descriptions      */
   condata=cond1     /* the side constraints      */
   conout=solution;  /* the solution data set     */
run;
The following messages, which appear on the SAS log, summarize the model as read by PROC
INTPOINT and note the progress toward a solution.
NOTE: Number of nodes= 14 .
NOTE: Number of supply nodes= 2 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 180 , total demand= 180 .
NOTE: Number of arcs= 18 .
NOTE: Number of <= side constraints= 0 .
NOTE: Number of == side constraints= 2 .
NOTE: Number of >= side constraints= 2 .
NOTE: Number of side constraint coefficients= 8 .
NOTE: The following messages relate to the equivalent Linear Programming
problem solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 16 .
NOTE: Number of >= constraints= 2 .
NOTE: Number of constraint coefficients= 44 .
NOTE: Number of variables= 18 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 3.
NOTE: After preprocessing, number of >= constraints= 2.
NOTE: The preprocessor eliminated 13 constraints from the problem.
NOTE: The preprocessor eliminated 33 constraint coefficients from the problem.
NOTE: After preprocessing, number of variables= 5.
NOTE: The preprocessor eliminated 13 variables from the problem.
NOTE: 4 columns, 0 rows and 4 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 10 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 5 factor nodes make up 1 supernodes
NOTE: There are 0 nonzero sub-rows or sub-columns outside the supernodal
triangular regions along the factor's leading diagonal.
NOTE: Bound feasibility attained by iteration 1.
NOTE: Dual feasibility attained by iteration 1.
NOTE: Constraint feasibility attained by iteration 1.
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 6
iterations.
NOTE: Optimum reached.
NOTE: Objective= 50875.
NOTE: The data set WORK.SOLUTION has 18 observations and 10 variables.
NOTE: There were 18 observations read from the data set WORK.ARCD1.
NOTE: There were 6 observations read from the data set WORK.NODED.
NOTE: There were 4 observations read from the data set WORK.COND1.
The first set of messages shows the size of the problem. The next set of messages provides statistics
on the size of the equivalent LP problem. The number of variables may not equal the number of arcs
if the problem has nonarc variables. This example has none. To convert a network to the equivalent
LP problem, a flow conservation constraint must be created for each node (including an excess or
bypass node, if required). This explains why the number of equality constraints and the number
of constraint coefficients differ from the number of equality side constraints and the number of
coefficients in all side constraints: here, the 14 flow conservation constraints plus the 2 equality side
constraints account for the 16 equality constraints in the equivalent LP.
If the preprocessor was successful in decreasing the problem size, some messages will report how
well it did. In this example, the model size was cut approximately in half!
The next set of messages describes aspects of the interior point algorithm. Of particular interest are
those concerned with the Cholesky factorization of AA^T, where A is the coefficient matrix of the
final LP. It is crucial to preorder the rows and columns of this matrix to prevent fill-in and reduce the
number of row operations to undertake the factorization. See the section Interior Point Algorithmic
Details on page 48 for a more extensive explanation.
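The manual does not reproduce the linear algebra at this point, but as a sketch (standard for primal-dual interior point methods generally, not specific to PROC INTPOINT), each iteration typically reduces to solving a symmetric positive definite system of normal equations:

```latex
% Normal-equations system solved via Cholesky factorization at each iteration;
% D is a diagonal scaling matrix that changes from iteration to iteration.
A D^{2} A^{T} \,\Delta y = r
```

The nonzero structure of $AD^{2}A^{T}$ is that of $AA^{T}$, which is why preordering the rows and columns of $AA^{T}$ governs the fill-in of the Cholesky factor.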
Unlike PROC LP, which displays the solution and other information as output, PROC INTPOINT
saves the optimum in the output SAS data set that you specify. For this example, the solution is saved
in the SOLUTION data set. It can be displayed with the PRINT procedure as
title3 'Optimum';
proc print data=solution;
var _from_ _to_ _cost_ _capac_ _lo_ _name_
_supply_ _demand_ _flow_ _fcost_;
sum _fcost_;
run;
Figure 4.7 CONOUT=SOLUTION
Oil Industry Example
Optimum
Obs _from_ _to_ _cost_ _capac_ _lo_ _name_ _SUPPLY_ _DEMAND_ _FLOW_ _FCOST_
1 refinery 1 r1 200 175 50 thruput1 . . 145.000 29000.00
2 refinery 2 r2 220 100 35 thruput2 . . 35.000 7700.00
3 r1 ref1 diesel 0 75 0 . . 36.250 0.00
4 r1 ref1 gas 0 140 0 r1_gas . . 108.750 0.00
5 r2 ref2 diesel 0 75 0 . . 8.750 0.00
6 r2 ref2 gas 0 100 0 r2_gas . . 26.250 0.00
7 middle east refinery 1 63 95 20 m_e_ref1 100 . 80.000 5040.00
8 u.s.a. refinery 1 55 99999999 0 80 . 65.000 3575.00
9 middle east refinery 2 81 80 10 m_e_ref2 100 . 20.000 1620.00
10 u.s.a. refinery 2 49 99999999 0 80 . 15.000 735.00
11 ref1 diesel servstn1 diesel 18 99999999 0 . 30 30.000 540.00
12 ref2 diesel servstn1 diesel 36 99999999 0 . 30 0.000 0.00
13 ref1 gas servstn1 gas 15 70 0 . 95 68.750 1031.25
14 ref2 gas servstn1 gas 17 35 5 . 95 26.250 446.25
15 ref1 diesel servstn2 diesel 17 99999999 0 . 15 6.250 106.25
16 ref2 diesel servstn2 diesel 23 99999999 0 . 15 8.750 201.25
17 ref1 gas servstn2 gas 22 60 0 . 40 40.000 880.00
18 ref2 gas servstn2 gas 31 99999999 0 . 40 0.000 0.00
========
50875.00
Notice that, in CONOUT=SOLUTION (Figure 4.7), the optimal flow through each arc in the network
is given in the variable named _FLOW_, and the cost of flow through each arc is given in the variable
_FCOST_.
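As a consistency check on Figure 4.7, flow is conserved at node r1, and the side constraint for refinery 1 holds at the optimum:

```latex
% Flow into r1 (arc thruput1) equals flow out (arc r1_gas plus the r1 -> ref1 diesel arc),
% and the gasoline flow is three-fourths of the crude processed.
145 = 108.75 + 36.25,
\qquad
108.75 = \tfrac{3}{4}\times 145
```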
Figure 4.8 Oil Industry Solution
[Network diagram of Figure 4.6, with each arc annotated with its optimal flow from the _FLOW_
column of Figure 4.7.]
Getting Started: LP Problems
Data for an LP problem resembles the data for side constraints and nonarc variables supplied to
PROC INTPOINT when solving an NPSC problem. It is also very similar to the data required by the
LP procedure.
To solve LP problems using PROC INTPOINT, you save a representation of the LP variables and
the constraints in one or two SAS data sets. These data sets are then passed to PROC INTPOINT
for solution. A problem's data can take various forms. You can use any one or a
combination of several of these forms.
The ARCDATA= data set contains information about the LP variables of the problem. Although this
data set is called ARCDATA, it contains data for no arcs. Instead, all data in this data set are related
to LP variables. This data set has no SAS variables containing values that are node names.
The ARCDATA= data set can be used to specify information about LP variables, including objective
function coefficients, lower and upper value bounds, and names. These data are the elements of
the objective vector and the lower and upper bound vectors in problem (LP). Data for an LP variable
can be given in more than one observation.
The CONDATA= data set describes the constraints and their right-hand sides. These data are elements
of the constraint coefficient matrix and the right-hand-side vector in problem (LP).
Constraint types are also specified in the CONDATA= data set. You can include in this data set LP
variable data such as upper bound values, lower value bounds, and objective function coefficients. It
is possible to give all information about some or all LP variables in the CONDATA= data set.
Because PROC INTPOINT evolved from PROC NETFLOW, another procedure in SAS/OR software
that was originally designed to solve models with networks, the ARCDATA= data set is always
expected. If the ARCDATA= data set is not specified, by default the last data set created before PROC
INTPOINT is invoked is assumed to be the ARCDATA= data set. However, these characteristics of
PROC INTPOINT are not helpful when an LP problem is being solved, all data is provided in a
single data set specified by the CONDATA= data set, and that data set is not the last data set created
before PROC INTPOINT starts. In this case, you must specify that the ARCDATA= data set and the
CONDATA= data set are both equal to the input data set. PROC INTPOINT then knows that an LP
problem is to be solved and that the data reside in one data set.
An LP variable is identified in this data set by its name. If you specify an LP variable's name in the
ARCDATA= data set, then this name is used to associate data in the CONDATA= data set with that
LP variable.
If you use the dense constraint input format (described in the section CONDATA= Data Set on
page 110), these LP variable names are names of SAS variables in the VAR list of the CONDATA=
data set.
If you use the sparse constraint input format (described in the section CONDATA= Data Set on
page 110), these LP variable names are values of the SAS variables in the COLUMN list of the
CONDATA= data set.
PROC INTPOINT reads the data from the ARCDATA= data set (if there is one) and the CONDATA=
data set. Error checking is performed, and the LP is preprocessed. Preprocessing is optional
but highly recommended. The preprocessor analyzes the model and tries to determine before
optimization whether LP variables can be fixed to their optimal values. The model can then be
modified and these LP variables dropped out. Some constraints may be found to be
redundant. Sometimes, preprocessing succeeds in reducing the size of the problem, thereby making
the subsequent optimization easier and faster.
The optimal solution is then found for the resulting LP. If the problem was preprocessed, the model
is now post-processed, where fixed LP variables are reintroduced. The solution can be saved in the
CONOUT= data set.
Introductory LP Example
Consider the linear programming problem in the section An Introductory Example on page 175.
The SAS data set in that section is created the same way here:
title 'Linear Programming Example';
title3 'Setting Up Condata = dcon1 For PROC INTPOINT';
data dcon1;
input _id_ $17.
a_light a_heavy brega naphthal naphthai
heatingo jet_1 jet_2
_type_ $ _rhs_;
datalines;
profit -175 -165 -205 0 0 0 300 300 max .
naphtha_l_conv .035 .030 .045 -1 0 0 0 0 eq 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0 eq 0
heating_o_conv .390 .300 .430 0 0 -1 0 0 eq 0
recipe_1 0 0 0 0 .3 .7 -1 0 eq 0
recipe_2 0 0 0 .2 0 .8 0 -1 eq 0
available 110 165 80 . . . . . upperbd .
;
To solve this problem, use
proc intpoint
bytes=1000000
condata=dcon1
conout=solutn1;
run;
Note that an input SAS data set for PROC LP can be used, without any changes, as an input data set
for PROC INTPOINT.
The following messages, which appear on the SAS log, summarize the model as read by PROC
INTPOINT and note the progress toward a solution:
NOTE: Number of variables= 8 .
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 5 .
NOTE: Number of >= constraints= 0 .
NOTE: Number of constraint coefficients= 18 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 0.
NOTE: After preprocessing, number of >= constraints= 0.
NOTE: The preprocessor eliminated 5 constraints from the problem.
NOTE: The preprocessor eliminated 18 constraint coefficients from the problem.
NOTE: After preprocessing, number of variables= 0.
NOTE: The preprocessor eliminated 8 variables from the problem.
NOTE: The optimum has been determined by the Preprocessor.
NOTE: Objective= 1544.
NOTE: The data set WORK.SOLUTN1 has 8 observations and 6 variables.
NOTE: There were 7 observations read from the data set WORK.DCON1.
Notice that the preprocessor succeeded in fixing all LP variables to their optimal values, eliminating
the need to do any actual optimization.
Unlike PROC LP, which displays the solution and other information as output, PROC INTPOINT
saves the optimum in the output SAS data set you specify. For this example, the solution is saved in
the SOLUTN1 data set. It can be displayed with PROC PRINT as follows:
title3 'LP Optimum';
proc print data=solutn1;
var _name_ _objfn_ _upperbd _lowerbd _value_ _fcost_;
sum _fcost_;
run;
Notice that in CONOUT=SOLUTN1 (Figure 4.9) the optimal value of each variable in
the LP is given in the variable named _VALUE_, and the objective function contribution of each
variable is given in the variable _FCOST_.
Figure 4.9 CONOUT=SOLUTN1
Linear Programming Example
LP Optimum
Obs _NAME_ _OBJFN_ _UPPERBD _LOWERBD _VALUE_ _FCOST_
1 a_heavy -165 165 0 0.00 0
2 a_light -175 110 0 110.00 -19250
3 brega -205 80 0 80.00 -16400
4 heatingo 0 99999999 0 77.30 0
5 jet_1 300 99999999 0 60.65 18195
6 jet_2 300 99999999 0 63.33 18999
7 naphthai 0 99999999 0 21.80 0
8 naphthal 0 99999999 0 7.45 0
=======
1544
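Each _FCOST_ value in Figure 4.9 is the product of _OBJFN_ and _VALUE_, and the nonzero contributions sum to the objective reported on the log:

```latex
% Per-variable objective contributions (_OBJFN_ x _VALUE_ = _FCOST_)
(-175)(110) = -19250, \qquad (-205)(80) = -16400,\\
(300)(60.65) = 18195, \qquad (300)(63.33) = 18999,\\
% Total objective
-19250 - 16400 + 18195 + 18999 = 1544
```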
The same model can be specified in the sparse format, as in the following scon2 data set. This format
enables you to omit the zero coefficients.
title3 'Setting Up Condata = scon2 For PROC INTPOINT';
data scon2;
format _type_ $8. _col_ $8. _row_ $16.;
input _type_ $ _col_ $ _row_ $ _coef_;
datalines;
max . profit .
eq . napha_l_conv .
eq . napha_i_conv .
eq . heating_oil_conv .
eq . recipe_1 .
eq . recipe_2 .
upperbd . available .
. a_light profit -175
. a_light napha_l_conv .035
. a_light napha_i_conv .100
. a_light heating_oil_conv .390
. a_light available 110
. a_heavy profit -165
. a_heavy napha_l_conv .030
. a_heavy napha_i_conv .075
. a_heavy heating_oil_conv .300
. a_heavy available 165
. brega profit -205
. brega napha_l_conv .045
. brega napha_i_conv .135
. brega heating_oil_conv .430
. brega available 80
. naphthal napha_l_conv -1
. naphthal recipe_2 .2
. naphthai napha_i_conv -1
. naphthai recipe_1 .3
. heatingo heating_oil_conv -1
. heatingo recipe_1 .7
. heatingo recipe_2 .8
. jet_1 profit 300
. jet_1 recipe_1 -1
. jet_2 profit 300
. jet_2 recipe_2 -1
;
To find the minimum cost solution, invoke PROC INTPOINT (note the SPARSECONDATA option,
which must be specified) as follows:
proc intpoint
bytes=1000000
sparsecondata
condata=scon2
conout=solutn2;
run;
A data set that can be used as the ARCDATA= data set can be initialized as follows:
data vars3;
input _name_ $ profit available;
datalines;
a_heavy -165 165
a_light -175 110
brega -205 80
heatingo 0 .
jet_1 300 .
jet_2 300 .
naphthai 0 .
naphthal 0 .
;
The following CONDATA= data set is the original dense format CONDATA=dcon1 data set after
the LP variables' nonconstraint information has been removed. (You could have left some or all of
that information in CONDATA, as PROC INTPOINT merges data, but doing that and checking for
consistency takes time.)
data dcon3;
input _id_ $17.
a_light a_heavy brega naphthal naphthai
heatingo jet_1 jet_2
_type_ $ _rhs_;
datalines;
naphtha_l_conv .035 .030 .045 -1 0 0 0 0 eq 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0 eq 0
heating_o_conv .390 .300 .430 0 0 -1 0 0 eq 0
recipe_1 0 0 0 0 .3 .7 -1 0 eq 0
recipe_2 0 0 0 .2 0 .8 0 -1 eq 0
;
NOTE: You must now specify the MAXIMIZE option; otherwise, PROC INTPOINT will optimize to
the minimum (which, incidentally, has a total objective = -3539.25). You must indicate that the SAS
variable profit in the ARCDATA=vars3 data set has values that are objective function coefficients, by
specifying the OBJFN statement. The UPPERBD statement must specify the SAS variable available,
which has upper bounds as values:
proc intpoint
   maximize        /* ***** necessary ***** */
   bytes=1000000
   arcdata=vars3
   condata=dcon3
   conout=solutn3;
   objfn profit;
   upperbd available;
run;
The ARCDATA=vars3 data set can become more concise by noting that the model variables heatingo,
naphthai, and naphthal have zero objective function coefficients (the default) and default upper bounds,
so those observations need not be present:
data vars4;
input _name_ $ profit available;
datalines;
a_heavy -165 165
a_light -175 110
brega -205 80
jet_1 300 .
jet_2 300 .
;
The CONDATA=dcon3 data set can become more concise by noting that all the constraints have
the same type (eq) and zero (the default) rhs values. This model is a good candidate for using the
DEFCONTYPE= option.
The DEFCONTYPE= option can be useful not only when all constraints have the same type, as is the
case here, but also when most constraints have the same type and you want to change the default
type from <= to = or >=. The essential constraint type data in the CONDATA= data set is that which
overrides the DEFCONTYPE= type you specified.
data dcon4;
input _id_ $17.
a_light a_heavy brega naphthal naphthai
heatingo jet_1 jet_2;
datalines;
naphtha_l_conv .035 .030 .045 -1 0 0 0 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0
heating_o_conv .390 .300 .430 0 0 -1 0 0
recipe_1 0 0 0 0 .3 .7 -1 0
recipe_2 0 0 0 .2 0 .8 0 -1
;
proc intpoint
   maximize defcontype=eq
   bytes=1000000
   arcdata=vars4
   condata=dcon4
   conout=solutn4;
objfn profit;
upperbd available;
run;
Here are several different ways of using the ARCDATA= data set and a sparse format CONDATA=
data set for this LP. The following CONDATA= data set is the result of removing the profit and
available data from the original sparse format CONDATA=scon2 data set.
data scon5;
format _type_ $8. _col_ $8. _row_ $16. ;
input _type_ $ _col_ $ _row_ $ _coef_;
datalines;
eq . napha_l_conv .
eq . napha_i_conv .
eq . heating_oil_conv .
eq . recipe_1 .
eq . recipe_2 .
. a_light napha_l_conv .035
. a_light napha_i_conv .100
. a_light heating_oil_conv .390
. a_heavy napha_l_conv .030
. a_heavy napha_i_conv .075
. a_heavy heating_oil_conv .300
. brega napha_l_conv .045
. brega napha_i_conv .135
. brega heating_oil_conv .430
. naphthal napha_l_conv -1
. naphthal recipe_2 .2
. naphthai napha_i_conv -1
. naphthai recipe_1 .3
. heatingo heating_oil_conv -1
. heatingo recipe_1 .7
. heatingo recipe_2 .8
. jet_1 recipe_1 -1
. jet_2 recipe_2 -1
;
proc intpoint
   maximize
   bytes=1000000
   sparsecondata
   arcdata=vars3   /* or arcdata=vars4 */
   condata=scon5
   conout=solutn5;
   objfn profit;
   upperbd available;
run;
The CONDATA=scon5 data set can become more concise by noting that all the constraints have the
same type (eq) and zero (the default) rhs values. Use the DEFCONTYPE= option again. Once the
first five observations of the CONDATA=scon5 data set are removed, the _type_ variable has values
that are missing in all of the remaining observations. Therefore, this variable can be removed.
data scon6;
input _col_ $ _row_&$16. _coef_;
datalines;
a_light napha_l_conv .035
a_light napha_i_conv .100
a_light heating_oil_conv .390
a_heavy napha_l_conv .030
a_heavy napha_i_conv .075
a_heavy heating_oil_conv .300
brega napha_l_conv .045
brega napha_i_conv .135
brega heating_oil_conv .430
naphthal napha_l_conv -1
naphthal recipe_2 .2
naphthai napha_i_conv -1
naphthai recipe_1 .3
heatingo heating_oil_conv -1
heatingo recipe_1 .7
heatingo recipe_2 .8
jet_1 recipe_1 -1
jet_2 recipe_2 -1
;
proc intpoint
maximize
bytes=1000000
defcontype=eq
sparsecondata
arcdata=vars4
condata=scon6
conout=solutn6;
objfn profit;
upperbd available;
run;
Typical PROC INTPOINT Run
You start PROC INTPOINT by giving the PROC INTPOINT statement. You can specify many
options in the PROC INTPOINT statement to control the procedure, or you can rely on default
settings and specify very few options. However, there are some options you must specify:
- You must specify the BYTES= parameter indicating the size of the working memory that the
procedure is allowed to use. This option has no default.
- In many instances (and certainly when solving NPSC problems), you need to specify the
ARCDATA= data set. This option has a default (the SAS data set that was created
last before PROC INTPOINT began running), but that may need to be overridden.
- The CONDATA= data set must also be specified if the problem is NPSC and has side
constraints, or if it is an LP problem.
- When solving a network problem, you have to specify the NODEDATA= data set if some
model data is given in such a data set.
Some options, while optional, are frequently required. To have the optimal solution output to a
SAS data set, you have to specify the CONOUT= data set. You may want to indicate reasons
why optimization should stop (for example, you can indicate the maximum number of iterations
that can be performed), or you might want to alter stopping criteria so that optimization does not
stop prematurely. Some options enable you to control other aspects of the interior point algorithm.
Specifying certain values for these options can reduce the time it takes to solve a problem.
The SAS variable lists should be given next. If you have SAS variables in the input data sets that
have special names (for example, a SAS variable in the ARCDATA= data set named _TAIL_ that has
tail nodes of arcs as values), it may not be necessary to have many or any variable lists. If you do
not specify a TAIL variable list, PROC INTPOINT will search the ARCDATA= data set for a SAS
variable named _TAIL_.
What usually follows is a RUN statement, which indicates that all information that you, the user,
need to supply to PROC INTPOINT has been given, and the procedure is to start running. This also
happens if you specify a statement in your SAS program that PROC INTPOINT does not recognize
as one of its own, such as the next DATA step or procedure.
The QUIT statement indicates that PROC INTPOINT must finish immediately.
For example, a PROC INTPOINT run might look something like this:
proc intpoint
   bytes=    /* working memory size */
   arcdata=  /* data set */
   condata=  /* data set */
   /* other options */
   ;
   variable list specifications;  /* if necessary */
run;  /* start running, read data, and do the optimization */
Syntax: INTPOINT Procedure
Below are statements used in PROC INTPOINT, listed in alphabetical order as they appear in the
text that follows.
PROC INTPOINT options ;
CAPACITY variable ;
COEF variables ;
COLUMN variable ;
COST variable ;
DEMAND variable ;
HEADNODE variable ;
ID variables ;
LO variable ;
NAME variable ;
NODE variable ;
QUIT ;
RHS variable ;
ROW variables ;
RUN ;
SUPDEM variable ;
SUPPLY variable ;
TAILNODE variable ;
TYPE variable ;
VAR variables ;
Functional Summary
Table 4.1 outlines the options that can be specified in the INTPOINT procedure. All options are
specified in the PROC INTPOINT statement.
Table 4.1 Functional Summary
Description Statement Option
Input Data Set Options:
Arcs input data set PROC INTPOINT ARCDATA=
Nodes input data set PROC INTPOINT NODEDATA=
Constraint input data set PROC INTPOINT CONDATA=
Output Data Set Options:
Constrained solution data set PROC INTPOINT CONOUT=
Convert sparse or dense format input data set
into MPS-format output data set
PROC INTPOINT MPSOUT=
Data Set Read Options:
CONDATA has sparse data format PROC INTPOINT SPARSECONDATA
Default constraint type PROC INTPOINT DEFCONTYPE=
Special COLUMN variable value PROC INTPOINT TYPEOBS=
Special COLUMN variable value PROC INTPOINT RHSOBS=
Used to interpret arc and variable names PROC INTPOINT NAMECTRL=
No nonarc data in ARCDATA PROC INTPOINT ARCS_ONLY_ARCDATA
Data for an arc found once in ARCDATA PROC INTPOINT ARC_SINGLE_OBS
Data for a constraint found once in CONDATA PROC INTPOINT CON_SINGLE_OBS
Data for a coefficient found once in CONDATA PROC INTPOINT NON_REPLIC=
Data is grouped, exploited during data read PROC INTPOINT GROUPED=
Problem Size Specification Options:
Approximate number of nodes PROC INTPOINT NNODES=
Approximate number of arcs PROC INTPOINT NARCS=
Approximate number of variables PROC INTPOINT NNAS=
Approximate number of coefficients PROC INTPOINT NCOEFS=
Approximate number of constraints PROC INTPOINT NCONS=
Network Options:
Default arc cost, objective function coefficient PROC INTPOINT DEFCOST=
Default arc capacity, variable upper bound PROC INTPOINT DEFCAPACITY=
Default arc flow and variable lower bound PROC INTPOINT DEFMINFLOW=
Network's only supply node PROC INTPOINT SOURCE=
SOURCE's supply capability PROC INTPOINT SUPPLY=
Network's only demand node PROC INTPOINT SINK=
SINK's demand PROC INTPOINT DEMAND=
Convey excess supply/demand through network PROC INTPOINT THRUNET
Find max flow between SOURCE and SINK PROC INTPOINT MAXFLOW
Cost of bypass arc, MAXFLOW problem PROC INTPOINT BYPASSDIVIDE=
Find shortest path from SOURCE to SINK PROC INTPOINT SHORTPATH
Interior Point Algorithm Options:
Factorization method PROC INTPOINT FACT_METHOD=
Allowed amount of dual infeasibility PROC INTPOINT TOLDINF=
Allowed amount of primal infeasibility PROC INTPOINT TOLPINF=
Allowed total amount of dual infeasibility PROC INTPOINT TOLTOTDINF=
Allowed total amount of primal infeasibility PROC INTPOINT TOLTOTPINF=
Cut-off tolerance for Cholesky factorization PROC INTPOINT CHOLTINYTOL=
Density threshold for Cholesky processing PROC INTPOINT DENSETHR=
Step-length multiplier PROC INTPOINT PDSTEPMULT=
Preprocessing type PROC INTPOINT PRSLTYPE=
Print optimization progress on SAS log PROC INTPOINT PRINTLEVEL2=
Ratio test zero tolerance PROC INTPOINT RTTOL=
Interior Point Algorithm Stopping Criteria:
Maximum number of interior point iterations PROC INTPOINT MAXITERB=
Primal-dual (duality) gap tolerance PROC INTPOINT PDGAPTOL=
Stop because of complementarity PROC INTPOINT STOP_C=
Stop because of duality gap PROC INTPOINT STOP_DG=
Stop because of infeas_b PROC INTPOINT STOP_IB=
Stop because of infeas_c PROC INTPOINT STOP_IC=
Stop because of infeas_d PROC INTPOINT STOP_ID=
Stop because of complementarity PROC INTPOINT AND_STOP_C=
Stop because of duality gap PROC INTPOINT AND_STOP_DG=
Stop because of infeas_b PROC INTPOINT AND_STOP_IB=
Stop because of infeas_c PROC INTPOINT AND_STOP_IC=
Stop because of infeas_d PROC INTPOINT AND_STOP_ID=
Don't stop because of complementarity PROC INTPOINT KEEPGOING_C=
Don't stop because of duality gap PROC INTPOINT KEEPGOING_DG=
Don't stop because of infeas_b PROC INTPOINT KEEPGOING_IB=
Don't stop because of infeas_c PROC INTPOINT KEEPGOING_IC=
Don't stop because of infeas_d PROC INTPOINT KEEPGOING_ID=
Don't stop because of complementarity PROC INTPOINT AND_KEEPGOING_C=
Don't stop because of duality gap PROC INTPOINT AND_KEEPGOING_DG=
Don't stop because of infeas_b PROC INTPOINT AND_KEEPGOING_IB=
Don't stop because of infeas_c PROC INTPOINT AND_KEEPGOING_IC=
Don't stop because of infeas_d PROC INTPOINT AND_KEEPGOING_ID=
Memory Control Options:
Issue memory usage messages to SAS log PROC INTPOINT MEMREP
Number of bytes to use for main memory PROC INTPOINT BYTES=
Miscellaneous Options:
Infinity value PROC INTPOINT INFINITY=
Maximization instead of minimization PROC INTPOINT MAXIMIZE
Zero tolerance - optimization PROC INTPOINT ZERO2=
Zero tolerance - real number comparisons PROC INTPOINT ZEROTOL=
Suppress similar SAS log messages PROC INTPOINT VERBOSE=
Scale problem data PROC INTPOINT SCALE=
Write optimization time to SAS log PROC INTPOINT OPTIM_TIMER
PROC INTPOINT Statement
PROC INTPOINT options ;
This statement invokes the procedure. The following options can be specified in the PROC INTPOINT
statement.
Data Set Options
This section briefly describes all the input and output data sets used by PROC INTPOINT. The
ARCDATA= data set, the NODEDATA= data set, and the CONDATA= data set can contain SAS
variables that have special names, for instance _CAPAC_, _COST_, and _HEAD_. PROC INTPOINT
looks for such variables if you do not give explicit variable list specifications. If a SAS variable with
a special name is found and that SAS variable is not in another variable list specification, PROC
INTPOINT determines that values of the SAS variable are to be interpreted in a special way. By using
SAS variables that have special names, you may not need to have any variable list specifications.
ARCDATA=SAS-data-set
names the data set that contains arc and, optionally, nonarc variable information and nodal
supply/demand data. The ARCDATA= data set must be specified in all PROC INTPOINT
statements when solving NPSC problems.
If your problem is an LP, the ARCDATA= data set is optional. You can specify LP variable
information such as objective function coefficients, and lower and upper bounds.
CONDATA=SAS-data-set
names the data set that contains the side constraint data. The data set can also contain other
data such as arc costs, capacities, lower flow bounds, nonarc variable upper and lower bounds,
and objective function coefficients. PROC INTPOINT needs a CONDATA= data set to
solve a constrained problem. See the section CONDATA= Data Set on page 110 for more
information.
If your problem is an LP, this data set contains the constraint data, and can also contain other
data such as objective function coefficients, and lower and upper bounds. PROC INTPOINT
needs a CONDATA= data set to solve an LP.
CONOUT=SAS-data-set
COUT=SAS-data-set
names the output data set that receives an optimal solution. See the section CONOUT= Data
Set on page 118 for more information.
If PROC INTPOINT is outputting observations to the output data set and you want this to stop,
press the keys used to stop SAS procedures.
MPSOUT=SAS-data-set
names the SAS data set that contains converted sparse or dense format input data in MPS
format. Invoking this option directs the INTPOINT procedure to halt before attempting
optimization. For more information about the MPSOUT= option, see the section Converting
Any PROC INTPOINT Format to an MPS-Format SAS Data Set on page 120. For more
information about the MPS-format SAS data set, see Chapter 16, The MPS-Format SAS Data
Set.
NODEDATA=SAS-data-set
names the data set that contains the node supply and demand specications. You do not need
observations in the NODEDATA= data set for transshipment nodes. (Transshipment nodes
neither supply nor demand ow.) All nodes are assumed to be transshipment nodes unless
PROC INTPOINT Statement ! 83
supply or demand data indicate otherwise. It is acceptable for some arcs to be directed toward
supply nodes or away from demand nodes.
This data set is used only when you are solving network problems (not when solving LP
problems), in which case the use of the NODEDATA= data set is optional provided that, if the
NODEDATA= data set is not used, supply and demand details are specied by other means.
Other means include using the MAXFLOW or SHORTPATH option, SUPPLY or DEMAND
variable list (or both) in the ARCDATA= data set, and the SOURCE=, SUPPLY=, SINK=, or
DEMAND= option in the PROC INTPOINT statement.
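For example, a single-supply, single-demand minimum cost problem can omit the NODEDATA=
data set entirely and give the supply and demand details through options. The data set names,
node names, and amounts in this sketch are hypothetical:

```sas
/* Hypothetical sketch: supply and demand given by options  */
/* instead of a NODEDATA= data set.                         */
proc intpoint
   arcdata=arcs              /* tail, head, cost, capacity per arc */
   condata=cons              /* side constraints                   */
   conout=solution
   source=factory supply=100 /* one supply node with 100 units     */
   sink=market demand=100;   /* one demand node with 100 units     */
run;
```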
General Options
The following is a list of options you can use with PROC INTPOINT. The options are listed in
alphabetical order.
ARCS_ONLY_ARCDATA
indicates that the ARCDATA= data set contains data for arcs only. When PROC INTPOINT
reads the data in the ARCDATA= data set, memory is not wasted on storage for data of
nonarc variables. The read might then be performed faster. See the section "How to Make the
Data Read of PROC INTPOINT More Efficient" on page 129.
ARC_SINGLE_OBS
indicates that for all arcs and nonarc variables, data for each arc or nonarc variable is found in
only one observation of the ARCDATA= data set. When reading the data in the ARCDATA=
data set, PROC INTPOINT knows that the data in an observation is for an arc or a nonarc
variable that has not had data previously read and that needs to be checked for consistency.
The read might then be performed faster.
When solving an LP, specifying the ARC_SINGLE_OBS option indicates that for all LP
variables, data for each LP variable is found in only one observation of the ARCDATA= data
set. When reading the data in the ARCDATA= data set, PROC INTPOINT knows that the data
in an observation is for an LP variable that has not had data previously read and that needs to
be checked for consistency. The read might then be performed faster.
If you specify ARC_SINGLE_OBS, PROC INTPOINT automatically works as if
GROUPED=ARCDATA is also specified.
See the section "How to Make the Data Read of PROC INTPOINT More Efficient" on
page 129.
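If you know that each arc occupies exactly one observation and that the ARCDATA= data set
holds no nonarc variable data, the two read-efficiency options can be combined. A hypothetical
sketch:

```sas
/* Hypothetical sketch: each arc appears in one observation of arcs, */
/* and arcs contains no nonarc variable data.                        */
proc intpoint
   arcdata=arcs condata=cons conout=solution
   arcs_only_arcdata   /* no nonarc variable data in ARCDATA=   */
   arc_single_obs;     /* also implies GROUPED=ARCDATA          */
run;
```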
BYPASSDIVIDE=b
BYPASSDIV=b
BPD=b
should be used only when the MAXFLOW option has been specified; that is, PROC INTPOINT
is solving a maximal flow problem. PROC INTPOINT prepares to solve maximal flow
problems by setting up a bypass arc. This arc is directed from the SOURCE= node to the SINK=
node and will eventually convey flow equal to INFINITY minus the maximal flow through the
network. The cost of the bypass arc must be great enough to drive flow through the network,
rather than through the bypass arc. Also, the cost of the bypass arc must be greater than the
eventual total cost of the maximal flow, which can be nonzero if some network arcs have
nonzero costs. The cost of the bypass arc is set to the value of the INFINITY= option divided
by the value of the BYPASSDIVIDE= option. Valid values for the BYPASSDIVIDE= option
must be greater than or equal to 1.1.
If no arcs in the MAXFLOW problem have nonzero costs, the cost of the bypass arc is
set to 1.0 (−1.0 if maximizing) if you do not specify the BYPASSDIVIDE= option. The default
value for the BYPASSDIVIDE= option (in the presence of nonzero arc costs) is 100.0.
BYTES=b
indicates the size of the main working memory (in bytes) that PROC INTPOINT will allocate.
Specifying this option is mandatory. The working memory is used to store all the arrays and
buffers used by PROC INTPOINT. If this memory has a size smaller than what is required to
store all arrays and buffers, PROC INTPOINT uses various schemes that page information
between auxiliary memory (often your machine's disk) and RAM.
For small problems, specify BYTES=100000. For large problems (those with hundreds of
thousands or millions of variables), BYTES=1000000 might do. For solving problems of that
size, if you are running on a machine with an inadequate amount of RAM, PROC INTPOINT's
performance will suffer since it will be forced to page or to rely on virtual memory.
If you specify the MEMREP option, PROC INTPOINT will issue messages on the SAS log
informing you of its memory usage; that is, how much memory is required to prevent paging,
and details about the amount of paging that must be performed, if applicable.
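The two options are often used together; the sizes in this sketch are illustrative only:

```sas
/* Illustrative only: request 20 MB of working memory and a memory report. */
proc intpoint
   bytes=20000000   /* main working memory, in bytes                      */
   memrep           /* log memory usage and whether paging was necessary  */
   arcdata=arcs condata=cons conout=solution;
run;
```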
CON_SINGLE_OBS
improves how the CONDATA= data set is read. How it works depends on whether the
CONDATA has a dense or sparse format.
If the CONDATA= data set has the dense format, specifying CON_SINGLE_OBS indicates
that data for each constraint can be found in only one observation of the CONDATA=
data set.
If the CONDATA= data set has a sparse format, and data for each arc, nonarc variable,
or LP variable can be found in only one observation of the CONDATA= data set, then specify
the CON_SINGLE_OBS option. If there are n SAS variables in the ROW and COEF lists, then
each arc or nonarc variable can have at most n constraint coefficients in the model. See the
section "How to Make the Data Read of PROC INTPOINT More Efficient" on page 129.
DEFCAPACITY=c
DC=c
requests that the default arc capacity and the default nonarc variable value upper bound (or, for
LP problems, the default LP variable value upper bound) be c. If this option is not specified,
then DEFCAPACITY=INFINITY.
DEFCONTYPE=c
DEFTYPE=c
DCT=c
specifies the default constraint type. This default constraint type is less than or equal to,
or is the type indicated by DEFCONTYPE=c. Valid values for this option are
LE, le, or <=   for less than or equal to
EQ, eq, or =    for equal to
GE, ge, or >=   for greater than or equal to
The values do not need to be enclosed in quotes.
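For instance, if most constraints in a model are greater-than-or-equal-to constraints, the default
can be changed so that only the exceptions need an explicit type. Data set names here are
hypothetical:

```sas
/* Hypothetical sketch: constraints without an explicit type are taken as >=. */
proc intpoint
   defcontype=ge
   arcdata=arcs condata=cons conout=solution;
run;
```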
DEFCOST=c
requests that the default arc cost and the default nonarc variable objective function coefficient
(or, for an LP, the default LP variable objective function coefficient) be c. If this option is not
specified, then DEFCOST=0.0.
DEFMINFLOW=m
DMF=m
requests that the default lower flow bound through arcs and the default lower value bound of
nonarc variables (or, for an LP, the default lower value bound of LP variables) be m. If a value
is not specified, then DEFMINFLOW=0.0.
DEMAND=d
specifies the demand at the SINK node specified by the SINK= option. The DEMAND= option
should be used only if the SINK= option is given in the PROC INTPOINT statement and
neither the SHORTPATH option nor the MAXFLOW option is specified. If you are solving a
minimum cost network problem and the SINK= option is used to identify the sink node, and
the DEMAND= option is not specified, then the demand at the sink node is made equal to the
network's total supply.
GROUPED=grouped
PROC INTPOINT can take a much shorter time to read data if the data have been grouped prior
to the PROC INTPOINT call. For instance, if the ARCDATA= data set has been grouped by
the values of the NAME list variable before PROC INTPOINT is called, then when PROC
INTPOINT encounters a new NAME list variable value, it can conclude that the value is new;
it does not need to check whether that NAME value has been read in a previous observation.
See the section "How to Make the Data Read of PROC INTPOINT More Efficient" on page 129.
• GROUPED=ARCDATA indicates that the ARCDATA= data set has been grouped by
values of the NAME list variable. If _NAME_ is the name of the NAME list variable, you
could use
proc sort data=arcdata; by _name_;
prior to calling PROC INTPOINT. Technically, you do not have to sort the data, only
to ensure that all similar values of the NAME list variable are grouped together. If you
specify the ARCS_ONLY_ARCDATA option, PROC INTPOINT automatically works
as if GROUPED=ARCDATA is also specified.
• GROUPED=CONDATA indicates that the CONDATA= data set has been grouped.
If the CONDATA= data set has a dense format, GROUPED=CONDATA indicates that
the CONDATA= data set has been grouped by values of the ROW list variable. If _ROW_
is the name of the ROW list variable, you could use
proc sort data=condata; by _row_;
prior to calling PROC INTPOINT. Technically, you do not have to sort the data, only to
ensure that all similar values of the ROW list variable are grouped together. If you specify
the CON_SINGLE_OBS option, or if there is no ROW list variable, PROC INTPOINT
automatically works as if GROUPED=CONDATA has been specified.
If the CONDATA= data set has the sparse format, GROUPED=CONDATA indicates that
CONDATA has been grouped by values of the COLUMN list variable. If _COL_ is the
name of the COLUMN list variable, you could use
proc sort data=condata; by _col_;
prior to calling PROC INTPOINT. Technically, you do not have to sort the data, only to
ensure that all similar values of the COLUMN list variable are grouped together.
• GROUPED=BOTH indicates that both GROUPED=ARCDATA and GROUPED=CONDATA
are TRUE.
• GROUPED=NONE indicates that the data sets have not been grouped; that is, neither
GROUPED=ARCDATA nor GROUPED=CONDATA is TRUE. This is the default,
but it is much better if GROUPED=ARCDATA, or GROUPED=CONDATA, or
GROUPED=BOTH.
A data set like
... _XXXXX_ ....
bbb
bbb
aaa
ccc
ccc
is a candidate for the GROUPED= option. Similar values are grouped together. When PROC
INTPOINT is reading the ith observation, either the value of the _XXXXX_ variable is the same
as the (i − 1)st (that is, the previous observation's) _XXXXX_ value, or it is a new _XXXXX_
value not seen in any previous observation. This also means that if the ith _XXXXX_ value is
different from the (i − 1)st _XXXXX_ value, the value of the (i − 1)st _XXXXX_ variable will
not be seen in any observations i, i + 1, ....
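Sorting both input data sets before the call allows GROUPED=BOTH. A hypothetical sketch
for the sparse CONDATA= case, assuming the NAME and COLUMN list variables are named
_name_ and _col_:

```sas
/* Hypothetical sketch: group both input data sets, then tell PROC INTPOINT. */
proc sort data=arcs; by _name_; run;   /* NAME list variable   */
proc sort data=cons; by _col_;  run;   /* COLUMN list variable */

proc intpoint
   arcdata=arcs condata=cons conout=solution
   sparsecondata
   grouped=both;
run;
```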
INFINITY=i
INF=i
is the largest number used by PROC INTPOINT in computations. A number too small can
adversely affect the solution process. You should avoid specifying an enormous value for the
INFINITY= option because numerical roundoff errors can result. If a value is not specified,
then INFINITY=99999999. The INFINITY= option cannot be assigned a value less than 9999.
MAXFLOW
MF
specifies that PROC INTPOINT solve a maximum flow problem. In this case, the PROC
INTPOINT procedure finds the maximum flow from the node specified by the SOURCE=
option to the node specified by the SINK= option. PROC INTPOINT automatically assigns an
INFINITY= option supply to the SOURCE= option node, and the SINK= option node is assigned
the INFINITY= option demand. In this way, the MAXFLOW option sets up a maximum flow
problem as an equivalent minimum cost problem.
You can use the MAXFLOW option when solving any flow problem (not necessarily a
maximum flow problem) when the network has one supply node (with infinite supply) and
one demand node (with infinite demand). The MAXFLOW option can be used in conjunction
with all other options (except SHORTPATH, SUPPLY=, and DEMAND=) and capabilities of
PROC INTPOINT.
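A maximal flow problem therefore needs only the arc data plus the SOURCE= and SINK=
nodes. All data set and node names in this sketch are hypothetical:

```sas
/* Hypothetical sketch: maximal flow from node s to node t. */
proc intpoint
   maxflow
   source=s sink=t
   arcdata=arcs        /* tail, head, and capacity of each arc */
   conout=maxflowsol;
run;
```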
MAXIMIZE
MAX
specifies that PROC INTPOINT find the maximum cost flow through the network. If both the
MAXIMIZE and the SHORTPATH options are specified, the solution obtained is the longest
path between the SOURCE= and SINK= nodes. Similarly, MAXIMIZE and MAXFLOW
together cause PROC INTPOINT to find the minimum flow between these two nodes; this is
zero if there are no nonzero lower flow bounds. If solving an LP, specifying the MAXIMIZE
option is necessary if you want the maximal optimal solution found instead of the minimal
optimum.
MEMREP
indicates that information on the memory usage and paging schemes (if necessary) is reported
by PROC INTPOINT on the SAS log.
NAMECTRL=i
is used to interpret arc and nonarc variable names in the CONDATA= data set. In the ARCDATA=
data set, an arc is identified by its tail and head nodes. In the CONDATA= data set, arcs
are identified by names. You can give a name to an arc by having a NAME list specification
that indicates a SAS variable in the ARCDATA= data set that has names of arcs as values.
PROC INTPOINT requires that arcs that have information about them in the CONDATA= data set
have names, but arcs that do not have information about them in the CONDATA= data set can also
have names. Unlike a nonarc variable whose name uniquely identifies it, an arc can have several
different names. An arc has a default name in the form tail_head; that is, the name of the arc's tail
node followed by an underscore and the name of the arc's head node.
In the CONDATA= data set, if the dense data format is used (described in the section "CONDATA=
Data Set" on page 110), a name of an arc or a nonarc variable is the name of a SAS variable listed in
the VAR list specification. If the sparse data format of the CONDATA= data set is used, a name of an
arc or a nonarc variable is a value of the SAS variable listed in the COLUMN list specification.
The NAMECTRL= option is used when a name of an arc or a nonarc variable in the CONDATA=
data set (either a VAR list variable name or a value of the COLUMN list variable) is in the form
tail_head and there exists an arc with these end nodes. If tail_head has not already been tagged as
belonging to an arc or nonarc variable in the ARCDATA= data set, PROC INTPOINT needs to know
whether tail_head is the name of the arc or the name of a nonarc variable.
If you specify NAMECTRL=1, a name that is not defined in the ARCDATA= data set is assumed to
be the name of a nonarc variable. NAMECTRL=2 treats tail_head as the name of the arc with these
end nodes, provided no other name is used to associate data in the CONDATA= data set with this
arc. If the arc does have other names that appear in the CONDATA= data set, tail_head is assumed
to be the name of a nonarc variable. If you specify NAMECTRL=3, tail_head is assumed to be a
name of the arc with these end nodes, whether the arc has other names or not. The default value of
NAMECTRL is 3.
If the dense format is used for the CONDATA= data set, there are two circumstances that affect how
this data set is read:
1. if you are running SAS Version 6 or an earlier version, or if you are running SAS
Version 7 onward and you specify
options validvarname=v6;
in your SAS session. Let's refer to this as case 1.
2. if you are running SAS Version 7 onward and you do not specify
options validvarname=v6;
in your SAS session. Let's refer to this as case 2.
For case 1, the SAS System converts SAS variable names in a SAS program to uppercase. The VAR
list variable names are uppercased. Because of this, PROC INTPOINT automatically uppercases
names of arcs and nonarc variables or LP variables (the values of the NAME list variable) in the
ARCDATA= data set. The names of arcs and nonarc variables or LP variables (the values of the
NAME list variable) appear uppercased in the CONOUT= data set.
Also, if the dense format is used for the CONDATA= data set, be careful with default arc names
(names in the form tailnode_headnode). Node names (values in the TAILNODE and HEADNODE
list variables) in the ARCDATA= data set are not automatically uppercased by PROC INTPOINT.
Consider the following statements:
data arcdata;
input _from_ $ _to_ $ _name_ $ ;
datalines;
from to1 .
from to2 arc2
TAIL TO3 .
;
data densecon;
input from_to1 from_to2 arc2 tail_to3;
datalines;
2 3 3 5
;
proc intpoint
arcdata=arcdata condata=densecon;
run;
The SAS System does not uppercase character string values within SAS data sets. PROC INTPOINT
never uppercases node names, so the arcs in observations 1, 2, and 3 in the preceding ARCDATA=
data set have the default names from_to1, from_to2, and TAIL_TO3, respectively. When the dense
format of the CONDATA= data set is used, PROC INTPOINT does uppercase values of the NAME
list variable, so the name of the arc in the second observation of the ARCDATA= data set is ARC2.
Thus, the second arc has two names: its default from_to2 and the other that was specified, ARC2.
As the SAS System uppercases program code, you must think of the input statement
input from_to1 from_to2 arc2 tail_to3;
as really being
INPUT FROM_TO1 FROM_TO2 ARC2 TAIL_TO3;
The SAS variables named FROM_TO1 and FROM_TO2 are not associated with any of the arcs in the
preceding ARCDATA= data set. The values FROM_TO1 and FROM_TO2 are different from all of the
arc names from_to1, from_to2, TAIL_TO3, and ARC2. FROM_TO1 and FROM_TO2 could end up being
the names of two nonarc variables.
The SAS variable named ARC2 is the name of the second arc in the ARCDATA= data set, even
though the name specied in the ARCDATA= data set looks like arc2. The SAS variable named
TAIL_TO3 is the default name of the third arc in the ARCDATA= data set.
For case 2, the SAS System does not convert SAS variable names in a SAS program to uppercase.
The VAR list variable names are not uppercased. PROC INTPOINT does not automatically uppercase
names of arcs and nonarc variables or LP variables (the values of the NAME list variable) in the
ARCDATA= data set. PROC INTPOINT does not uppercase any SAS variable names, data set
values, or indeed anything. Therefore, PROC INTPOINT respects case, and characters in the data,
if compared, must have the right case if you mean them to be the same. Note how the INPUT
statement in the DATA step that initializes the data set densecon below is specified in the following code:
data arcdata;
input _from_ $ _to_ $ _name_ $ ;
datalines;
from to1 .
from to2 arc2
TAIL TO3 .
;
data densecon;
input from_to1 from_to2 arc2 TAIL_TO3;
datalines;
2 3 3 5
;
proc intpoint
arcdata=arcdata condata=densecon;
run;
NARCS=n
specifies the approximate number of arcs. See the section "How to Make the Data Read of
PROC INTPOINT More Efficient" on page 129.
NCOEFS=n
specifies the approximate number of constraint coefficients. See the section "How to Make the
Data Read of PROC INTPOINT More Efficient" on page 129.
NCONS=n
specifies the approximate number of constraints. See the section "How to Make the Data Read
of PROC INTPOINT More Efficient" on page 129.
NNAS=n
specifies the approximate number of nonarc variables. See the section "How to Make the Data
Read of PROC INTPOINT More Efficient" on page 129.
NNODES=n
specifies the approximate number of nodes. See the section "How to Make the Data Read of
PROC INTPOINT More Efficient" on page 129.
NON_REPLIC=non_replic
prevents PROC INTPOINT from doing unnecessary checks of data previously read.
• NON_REPLIC=COEFS indicates that each constraint coefficient is specified once in the
CONDATA= data set.
• NON_REPLIC=NONE indicates that constraint coefficients can be specified more than
once in the CONDATA= data set. NON_REPLIC=NONE is the default.
See the section "How to Make the Data Read of PROC INTPOINT More Efficient" on
page 129.
OPTIM_TIMER
indicates that the procedure is to issue a message to the SAS log giving the CPU time spent
doing optimization. This includes the time spent preprocessing, performing optimization, and
postprocessing. Not counted in that time is the rest of the procedure execution, which includes
reading the data and creating output SAS data sets.
The time spent optimizing can be small compared to the total CPU time used by the procedure.
This is especially true when the problem is quite small (e.g., fewer than 10,000 variables).
RHSOBS=charstr
specifies the keyword that identifies a right-hand-side observation when using the sparse format
for data in the CONDATA= data set. The keyword is expected as a value of the SAS variable
in the CONDATA= data set named in the COLUMN list specification. The default value of the
RHSOBS= option is _RHS_ or _rhs_. If charstr is not a valid SAS variable name, enclose it
in quotes.
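For example, constraint data that flag right-hand-side observations with a different keyword
can be read as follows. All names and the keyword in this sketch are hypothetical:

```sas
/* Hypothetical sketch: right-hand-side observations in the sparse   */
/* CONDATA= data set carry the value 'rhs-row' instead of _RHS_.     */
proc intpoint
   sparsecondata
   rhsobs='rhs-row'   /* not a valid SAS name, so it is quoted */
   arcdata=arcs condata=cons conout=solution;
run;
```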
SCALE=scale
indicates that the NPSC side constraints or the LP constraints are to be scaled. Scaling is
useful when some coefficients are either much larger or much smaller than other coefficients.
Scaling might make all coefficients have values that have a smaller range, and this can make
computations more stable numerically. Try the SCALE= option if PROC INTPOINT is unable
to solve a problem because of numerical instability. Specify
• SCALE=ROW, SCALE=CON, or SCALE=CONSTRAINT if you want the largest
absolute value of coefficients in each constraint to be about 1.0
• SCALE=COL, SCALE=COLUMN, or SCALE=NONARC if you want NPSC nonarc
variable columns or LP variable columns to be scaled so that the absolute value of the
largest constraint coefficient of that variable is near to 1
• SCALE=BOTH if you want the largest absolute value of coefficients in each constraint,
and the absolute value of the largest constraint coefficient of an NPSC nonarc variable or
LP variable, to be near to 1. This is the default.
• SCALE=NONE if no scaling should be done
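For a numerically troublesome model you might first try row scaling alone before the default
SCALE=BOTH. Data set names in this sketch are hypothetical:

```sas
/* Hypothetical sketch: scale constraint rows only. */
proc intpoint
   scale=row
   arcdata=arcs condata=cons conout=solution;
run;
```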
SHORTPATH
SP
specifies that PROC INTPOINT solve a shortest path problem. The INTPOINT procedure
finds the shortest path between the nodes specified in the SOURCE= option and the SINK=
option. The costs of arcs are their lengths. PROC INTPOINT automatically assigns a supply of
one flow unit to the SOURCE= node, and the SINK= node is assigned to have a one flow unit
demand. In this way, the SHORTPATH option sets up a shortest path problem as an equivalent
minimum cost problem.
If a network has one supply node (with supply of one unit) and one demand node (with demand
of one unit), you could specify the SHORTPATH option, with the SOURCE= and SINK=
nodes, even if the problem is not a shortest path problem. You then should not provide any
supply or demand data in the NODEDATA= data set or the ARCDATA= data set.
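A shortest path run thus needs only the arcs (with costs as lengths) and the two end nodes.
All names in this sketch are hypothetical:

```sas
/* Hypothetical sketch: shortest path from node a to node z;  */
/* the arc costs in arcs are interpreted as lengths.          */
proc intpoint
   shortpath
   source=a sink=z
   arcdata=arcs
   conout=pathsol;
run;
```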
SINK=sinkname
SINKNODE=sinkname
identifies the demand node. The SINK= option is useful when you specify the MAXFLOW
option or the SHORTPATH option and you need to specify toward which node the shortest
path or maximum flow is directed. The SINK= option also can be used when a minimum
cost problem has only one demand node. Rather than having this information in the ARCDATA=
data set or the NODEDATA= data set, use the SINK= option with an accompanying
DEMAND= specification for this node. The SINK= option must be the name of a head node
of at least one arc; thus, it must have a character value. If the value of the SINK= option is not
a valid SAS character variable name (if, for example, it contains embedded blanks), it must be
enclosed in quotes.
SOURCE=sourcename
SOURCENODE=sourcename
identifies a supply node. The SOURCE= option is useful when you specify the MAXFLOW or
the SHORTPATH option and need to specify from which node the shortest path or maximum
flow originates. The SOURCE= option also can be used when a minimum cost problem has
only one supply node. Rather than having this information in the ARCDATA= data set or the
NODEDATA= data set, use the SOURCE= option with an accompanying SUPPLY= amount
of supply at this node. The SOURCE= option must be the name of a tail node of at least
one arc; thus, it must have a character value. If the value of the SOURCE= option is not a
valid SAS character variable name (if, for example, it contains embedded blanks), it must be
enclosed in quotes.
SPARSECONDATA
SCDATA
indicates that the CONDATA= data set has data in the sparse data format. Otherwise, it is
assumed that the data are in the dense format.
NOTE: If the SPARSECONDATA option is not specied, and you are running SAS software
Version 6 or you have specied
options validvarname=v6;
all NAME list variable values in the ARCDATA= data set are uppercased. See the section
"Case Sensitivity" on page 120.
SUPPLY=s
specifies the supply at the source node specified by the SOURCE= option. The SUPPLY=
option should be used only if the SOURCE= option is given in the PROC INTPOINT statement
and neither the SHORTPATH option nor the MAXFLOW option is specified. If you are solving
a minimum cost network problem and the SOURCE= option is used to identify the source
node and the SUPPLY= option is not specified, then by default the supply at the source node is
made equal to the network's total demand.
THRUNET
tells PROC INTPOINT to force through the network any excess supply (the amount by which
total supply exceeds total demand) or any excess demand (the amount by which total demand
exceeds total supply) as is required. If a network problem has unequal total supply and
total demand and the THRUNET option is not specified, PROC INTPOINT drains away the
excess supply or excess demand in an optimal manner. The consequences of specifying or
not specifying THRUNET are discussed in the section "Balancing Total Supply and Total
Demand" on page 127.
TYPEOBS=charstr
specifies the keyword that identifies a type observation when using the sparse format for data
in the CONDATA= data set. The keyword is expected as a value of the SAS variable in the
CONDATA= data set named in the COLUMN list specification. The default value of the
TYPEOBS= option is _TYPE_ or _type_. If charstr is not a valid SAS variable name, enclose
it in quotes.
VERBOSE=v
limits the number of similar messages that are displayed on the SAS log.
For example, when reading the ARCDATA= data set, PROC INTPOINT might have cause to
issue the following message many times:
ERROR: The HEAD list variable value in obs i in ARCDATA is
missing and the TAIL list variable value of this obs
is nonmissing. This is an incomplete arc specification.
If there are many observations that have this fault, similar messages are issued for only
the first VERBOSE= such observations. After the ARCDATA= data set has been read, PROC
INTPOINT will issue the message
NOTE: More messages similar to the ones immediately above
could have been issued but were suppressed as
VERBOSE=v.
If observations in the ARCDATA= data set have this error, PROC INTPOINT stops and you
have to fix the data. Imagine that this error is only a warning and PROC INTPOINT proceeded
to other operations such as reading the CONDATA= data set. If PROC INTPOINT finds there
are numerous errors when reading that data set, the number of messages issued to the SAS log
is also limited by the VERBOSE= option.
When PROC INTPOINT finishes and messages have been suppressed, the message
NOTE: To see all messages, specify VERBOSE=vmin.
is issued. The value of vmin is the smallest value that should be specied for the VERBOSE=
option so that all messages are displayed if PROC INTPOINT is run again with the same data
and everything else (except VERBOSE=vmin) unchanged.
The default value for the VERBOSE= option is 12.
ZERO2=z
Z2=z
specifies the zero tolerance level used when determining whether the final solution has been
reached. ZERO2= is also used when outputting the solution to the CONOUT= data set. Values
within z of zero are set to 0.0, where z is the value of the ZERO2= option. Flows close to the
lower flow bound or capacity of arcs are reassigned those exact values. If there are nonarc
variables, values close to the lower or upper value bound of nonarc variables are reassigned
those exact values. When solving an LP problem, values close to the lower or upper value
bound of LP variables are reassigned those exact values.
The ZERO2= option works when determining whether optimality has been reached or whether
an element in the vector (Δx^k, Δy^k, Δs^k) is less than or greater than zero. It is also crucial
when determining the maximal value for the step length α in the formula

(x^(k+1), y^(k+1), s^(k+1)) = (x^k, y^k, s^k) + α (Δx^k, Δy^k, Δs^k)

See the description of the PDSTEPMULT= option for more details on this computation.
Two values are deemed to be close if one is within z of the other. The default value for the
ZERO2= option is 0.000001. Any value specified for the ZERO2= option that is < 0.0 or >
0.0001 is not valid.
ZEROTOL=z
specifies the zero tolerance used when PROC INTPOINT must compare any real number
with another real number, or zero. For example, if x and y are real numbers, then for x to be
considered greater than y, x must be at least y + z. The ZEROTOL= option is used throughout
any PROC INTPOINT run.
ZEROTOL=z controls the way PROC INTPOINT performs all double precision comparisons;
that is, whether a double precision number is equal to, not equal to, greater than (or equal to),
or less than (or equal to) zero or some other double precision number. A double precision
number is deemed to be the same as another such value if the absolute difference between
them is less than or equal to the value of the ZEROTOL= option.
The default value for the ZEROTOL= option is 1.0E−14. You can specify the ZEROTOL=
option in the INTPOINT statement. Valid values for the ZEROTOL= option must be > 0.0
and < 0.0001. Do not specify a value too close to zero as this defeats the purpose of the
ZEROTOL= option. Neither should the value be too large, as comparisons might be incorrectly
performed.
Interior Point Algorithm Options
FACT_METHOD=f
enables you to choose the type of algorithm used to factorize and solve the main linear systems
at each iteration of the interior point algorithm.
FACT_METHOD=LEFT_LOOKING is new for SAS 9.1.2. It uses algorithms described
in George, Liu, and Ng (2001). Left looking is one of the main methods used to perform
Cholesky factorization and, along with some recently developed implementation approaches,
can be faster and require less memory than other algorithms.
Specify FACT_METHOD=USE_OLD if you want the procedure to use the only factorization
available prior to SAS 9.1.2.
TOLDINF=t
RTOLDINF=t
specifies the allowed amount of dual infeasibility. In the section "Interior Point Algorithmic
Details" on page 48, the vector infeas_d is defined. If all elements of this vector are ≤ t, the
solution is considered dual feasible. infeas_d is replaced by a zero vector, making computations
faster. This option is the dual equivalent to the TOLPINF= option. Increasing the value of
the TOLDINF= option too much can lead to instability, but a modest increase can give the
algorithm added flexibility and decrease the iteration count. Valid values for t are greater than
1.0E−12. The default is 1.0E−7.
TOLPINF=t
RTOLPINF=t
specifies the allowed amount of primal infeasibility. This option is the primal equivalent
to the TOLDINF= option. In the section "Interior Point: Upper Bounds" on page 52, the
vector infeas_b is defined. In the section "Interior Point Algorithmic Details" on page 48, the
vector infeas_c is defined. If all elements in these vectors are ≤ t, the solution is considered
primal feasible. infeas_b and infeas_c are replaced by zero vectors, making computations faster.
Increasing the value of the TOLPINF= option too much can lead to instability, but a modest
increase can give the algorithm added flexibility and decrease the iteration count. Valid values
for t are greater than 1.0E−12. The default is 1.0E−7.
TOLTOTDINF=t
RTOLTOTDINF=t
specifies the allowed total amount of dual infeasibility. In the section "Interior Point Algorithmic
Details" on page 48, the vector infeas_d is defined. If Σ_{i=1..n} infeas_d_i ≤ t, the solution is
considered dual feasible. infeas_d is replaced by a zero vector, making computations faster.
This option is the dual equivalent to the TOLTOTPINF= option. Increasing the value of the
TOLTOTDINF= option too much can lead to instability, but a modest increase can give the
algorithm added flexibility and decrease the iteration count. Valid values for t are greater than
1.0E−12. The default is 1.0E−7.
TOLTOTPINF=t
RTOLTOTPINF=t
specifies the allowed total amount of primal infeasibility. This option is the primal equivalent
to the TOLTOTDINF= option. In the section "Interior Point: Upper Bounds" on page 52,
the vector infeas_b is defined. In the section "Interior Point Algorithmic Details" on page 48,
the vector infeas_c is defined. If Σ_{i=1..n} infeas_b_i ≤ t and Σ_{i=1..n} infeas_c_i ≤ t, the solution
is considered primal feasible. infeas_b and infeas_c are replaced by zero vectors, making
computations faster. Increasing the value of the TOLTOTPINF= option too much can lead
to instability, but a modest increase can give the algorithm added flexibility and decrease the
iteration count. Valid values for t are greater than 1.0E−12. The default is 1.0E−7.
CHOLTINYTOL=c
RCHOLTINYTOL=c
specifies the cut-off tolerance for the Cholesky factorization. If a diagonal value
drops below c, the row is essentially treated as dependent and is ignored in the factorization.
Valid values for c are between 1.0E−30 and 1.0E−6. The default value is 1.0E−8.
DENSETHR=d
RDENSETHR=d
specifies the density threshold for Cholesky factorization. When the symbolic factorization
encounters a column of L (where L is the remaining unfactorized submatrix) that has
DENSETHR= proportion of nonzeros and the remaining part of L is at least 12 × 12, the
remainder of L is treated as dense. In practice, the lower right part of the Cholesky triangular
factor L is quite dense, and it can be computationally more efficient to treat it as 100% dense.
The default value for d is 0.7. A specification of d ≤ 0.0 causes all dense processing; d ≥ 1.0
causes all sparse processing.
96 ! Chapter 4: The INTPOINT Procedure
PDSTEPMULT=p
RPDSTEPMULT=p
specifies the step-length multiplier. The maximum feasible step-length chosen by the interior point algorithm is multiplied by the value of the PDSTEPMULT= option. This number must be less than 1 to avoid moving beyond the barrier. An actual step-length greater than 1 indicates numerical difficulties. Valid values for p are between 0.01 and 0.999999. The default value is 0.99995.
In the section Interior Point Algorithmic Details on page 48, the solution of the next iteration is obtained by moving along a direction from the current iteration's solution:

(x^{k+1}, y^{k+1}, s^{k+1}) = (x^k, y^k, s^k) + α (Δx^k, Δy^k, Δs^k)

where α is the maximum feasible step-length chosen by the interior point algorithm. If α ≥ 1, then α is reduced slightly by multiplying it by p. α is a value as large as possible but ≤ 1.0 and not so large that an x_i^{k+1} or s_i^{k+1} of some variable i is too close to zero.
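As a hedged sketch (the data set names are hypothetical), a more conservative step-length multiplier can be requested as follows; smaller values keep iterates farther from the barrier at the cost of possibly more iterations:

```sas
proc intpoint
   arcdata=arcs condata=cons conout=sol
   pdstepmult=0.99;   /* default is 0.99995 */
   run;
```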
PRSLTYPE=p
IPRSLTYPE=p
Preprocessing the linear programming problem often succeeds in allowing some variables and constraints to be temporarily eliminated from the resulting LP that must be solved. This reduces the solution time and possibly also the chance that the optimizer will run into numerical difficulties. The task of preprocessing is inexpensive to do.
You control how much preprocessing to do by specifying PRSLTYPE=p, where p can be -1, 0, 1, 2, or 3:
-1   Do not perform preprocessing. For most problems, specifying PRSLTYPE=-1 is not recommended.
0    Given upper and lower bounds on each variable, the greatest and least contribution to the row activity of each variable is computed. If these are within the limits set by the upper and lower bounds on the row activity, then the row is redundant and can be discarded. Otherwise, whenever possible, the bounds on any of the variables are tightened. For example, if all coefficients in a constraint are positive and all variables have zero lower bounds, then the row's smallest contribution is zero. If the rhs value of this constraint is zero, then if the constraint type is = or ≤, all the variables in that constraint are fixed to zero. These variables and the constraint are removed. If the constraint type is ≥, the constraint is redundant. If the rhs is negative and the constraint is ≤, the problem is infeasible. If just one variable in a row is not fixed, the row is used to impose an implicit upper or lower bound on the variable and then this row is eliminated. The preprocessor also tries to tighten the bounds on constraint right-hand sides.
1    When there are exactly two unfixed variables with coefficients in an equality constraint, one variable is solved in terms of the other. The problem will have one less variable. The new matrix will have at least two fewer coefficients and one less constraint. In other constraints where both variables appear, two coefficients are combined into one. PRSLTYPE=0 reductions are also done.
2    It may be possible to determine that an equality constraint is not constraining a variable. That is, if all variables are nonnegative, then x - Σ_i y_i = 0 does not constrain x, since x must be nonnegative if all the y_i are nonnegative. In this case, x is eliminated by subtracting this equation from all others containing x. This is useful when the only other entry for x is in the objective function. This reduction is performed if there is at most one other nonobjective coefficient. PRSLTYPE=0 reductions are also done.
3    All possible reductions are performed. PRSLTYPE=3 is the default.
Preprocessing is iterative. As variables are fixed and eliminated, as constraints are found to be redundant and are likewise eliminated, and as variable bounds and constraint right-hand sides are tightened, the LP to be optimized is modified to reflect these changes. Another iteration of preprocessing of the modified LP may reveal more variables and constraints that can be eliminated or tightened.
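A minimal sketch (hypothetical data set names) that restricts preprocessing to the bound-tightening reductions of PRSLTYPE=0 rather than the default PRSLTYPE=3:

```sas
proc intpoint
   arcdata=arcs condata=cons conout=sol
   prsltype=0;
   run;
```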
PRINTLEVEL2=p
is used when you want to see PROC INTPOINT's progress to the optimum. PROC INTPOINT will produce a table on the SAS log. A row of the table is generated during each iteration and may consist of values of
- the affine step complementarity
- the complementarity of the solution for the next iteration
- the total bound infeasibility Σ_{i=1}^{n} infeas_{bi} (see the infeas_b array in the section Interior Point: Upper Bounds on page 52)
- the total constraint infeasibility Σ_{i=1}^{n} infeas_{ci} (see the infeas_c array in the section Interior Point Algorithmic Details on page 48)
- the total dual infeasibility Σ_{i=1}^{n} infeas_{di} (see the infeas_d array in the section Interior Point Algorithmic Details on page 48)
As optimization progresses, the values in all columns should converge to zero. If you specify PRINTLEVEL2=2, all columns will appear in the table. If PRINTLEVEL2=1 is specified, only the affine step complementarity and the complementarity of the solution for the next iteration will appear. Some time is saved by not calculating the infeasibility values.
PRINTLEVEL2=2 is specified in all PROC INTPOINT runs in the section Examples: INTPOINT Procedure on page 137.
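A sketch (hypothetical data set names) that requests the full iteration log:

```sas
proc intpoint
   arcdata=arcs condata=cons conout=sol
   printlevel2=2;   /* write all progress columns to the SAS log */
   run;
```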
RTTOL=r
specifies the zero tolerance used during the ratio test of the interior point algorithm. The ratio test determines α, the maximum feasible step length.
Valid values for r are greater than 1.0E-14. The default value is 1.0E-10.
In the section Interior Point Algorithmic Details on page 48, the solution of the next iteration is obtained by moving along a direction from the current iteration's solution:

(x^{k+1}, y^{k+1}, s^{k+1}) = (x^k, y^k, s^k) + α (Δx^k, Δy^k, Δs^k)
where α is the maximum feasible step-length chosen by the interior point algorithm. If α ≥ 1, then α is reduced slightly by multiplying it by the value of the PDSTEPMULT= option. α is a value as large as possible but ≤ 1.0 and not so large that an x_i^{k+1} or s_i^{k+1} of some variable i is negative. When determining α, only negative elements of Δx and Δs are important.
RTTOL=r indicates a number close to zero so that another number n is considered truly negative if n ≤ -r. Even though n < 0, if n > -r, n may be too close to zero and may have the wrong sign due to rounding error.
Interior Point Algorithm Options: Stopping Criteria
MAXITERB=m
IMAXITERB=m
specifies the maximum number of iterations that the interior point algorithm can perform. The default value for m is 100. One of the most remarkable aspects of the interior point algorithm is that for most problems it usually needs only a small number of iterations, no matter the size of the problem.
PDGAPTOL=p
RPDGAPTOL=p
specifies the primal-dual gap or duality gap tolerance. Duality gap is defined in the section Interior Point Algorithmic Details on page 48. If the relative gap (duality gap / (c^T x)) between the primal and dual objectives is smaller than the value of the PDGAPTOL= option and both the primal and dual problems are feasible, then PROC INTPOINT stops optimization with a solution that is deemed optimal. Valid values for p are between 1.0E-12 and 1.0E-1. The default is 1.0E-7.
STOP_C=s
is used to determine whether optimization should stop. At the beginning of each iteration, if complementarity (the value of the Complem-ity column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
STOP_DG=s
is used to determine whether optimization should stop. At the beginning of each iteration, if the duality gap (the value of the Duality_gap column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
STOP_IB=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total bound infeasibility Σ_{i=1}^{n} infeas_{bi} (see the infeas_b array in the section Interior Point: Upper Bounds on page 52; this value appears in the Tot_infeasb column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
STOP_IC=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total constraint infeasibility Σ_{i=1}^{n} infeas_{ci} (see the infeas_c array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasc column in the table produced when you specify PRINTLEVEL2=2) is <= s, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
STOP_ID=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total dual infeasibility Σ_{i=1}^{n} infeas_{di} (see the infeas_d array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasd column in the table produced when you specify PRINTLEVEL2=2) is <= s, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
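The STOP_ options can be combined; a sketch (hypothetical data set names) that stops as soon as either complementarity or the duality gap falls to 1.0E-8:

```sas
proc intpoint
   arcdata=arcs condata=cons conout=sol
   stop_c=1.0E-8
   stop_dg=1.0E-8;
   run;
```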
AND_STOP_C=s
is used to determine whether optimization should stop. At the beginning of each iteration, if complementarity (the value of the Complem-ity column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, and the other conditions related to other AND_STOP parameters are also satisfied, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
AND_STOP_DG=s
is used to determine whether optimization should stop. At the beginning of each iteration, if the duality gap (the value of the Duality_gap column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, and the other conditions related to other AND_STOP parameters are also satisfied, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
AND_STOP_IB=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total bound infeasibility Σ_{i=1}^{n} infeas_{bi} (see the infeas_b array in the section Interior Point: Upper Bounds on page 52; this value appears in the Tot_infeasb column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is <= s, and the other conditions related to other AND_STOP parameters are also satisfied, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
AND_STOP_IC=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total constraint infeasibility Σ_{i=1}^{n} infeas_{ci} (see the infeas_c array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasc column in the table produced when you specify PRINTLEVEL2=2) is <= s, and the other conditions related to other AND_STOP parameters are also satisfied, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
AND_STOP_ID=s
is used to determine whether optimization should stop. At the beginning of each iteration, if total dual infeasibility Σ_{i=1}^{n} infeas_{di} (see the infeas_d array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasd column in the table produced when you specify PRINTLEVEL2=2) is <= s, and the other conditions related to other AND_STOP parameters are also satisfied, optimization will stop. This option is discussed in the section Stopping Criteria on page 134.
KEEPGOING_C=s
is used to determine whether optimization should stop. When a stopping condition is met,
if complementarity (the value of the Complem-ity column in the table produced when you
specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is > s, optimization will continue. This
option is discussed in the section Stopping Criteria on page 134.
KEEPGOING_DG=s
is used to determine whether optimization should stop. When a stopping condition is met, if
the duality gap (the value of the Duality_gap column in the table produced when you specify
PRINTLEVEL2=1 or PRINTLEVEL2=2) is > s, optimization will continue. This option is
discussed in the section Stopping Criteria on page 134.
KEEPGOING_IB=s
is used to determine whether optimization should stop. When a stopping condition is met, if total bound infeasibility Σ_{i=1}^{n} infeas_{bi} (see the infeas_b array in the section Interior Point: Upper Bounds on page 52; this value appears in the Tot_infeasb column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is > s, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
KEEPGOING_IC=s
is used to determine whether optimization should stop. When a stopping condition is met, if total constraint infeasibility Σ_{i=1}^{n} infeas_{ci} (see the infeas_c array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasc column in the table produced when you specify PRINTLEVEL2=2) is > s, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
KEEPGOING_ID=s
is used to determine whether optimization should stop. When a stopping condition is met, if total dual infeasibility Σ_{i=1}^{n} infeas_{di} (see the infeas_d array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasd column in the table produced when you specify PRINTLEVEL2=2) is > s, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
AND_KEEPGOING_C=s
is used to determine whether optimization should stop. When a stopping condition is met, if complementarity (the value of the Complem-ity column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is > s, and the other conditions related to other AND_KEEPGOING parameters are also satisfied, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
AND_KEEPGOING_DG=s
is used to determine whether optimization should stop. When a stopping condition is met, if the duality gap (the value of the Duality_gap column in the table produced when you specify PRINTLEVEL2=1 or PRINTLEVEL2=2) is > s, and the other conditions related to other AND_KEEPGOING parameters are also satisfied, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
AND_KEEPGOING_IB=s
is used to determine whether optimization should stop. When a stopping condition is met, if total bound infeasibility Σ_{i=1}^{n} infeas_{bi} (see the infeas_b array in the section Interior Point: Upper Bounds on page 52; this value appears in the Tot_infeasb column in the table produced when you specify PRINTLEVEL2=2) is > s, and the other conditions related to other AND_KEEPGOING parameters are also satisfied, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
AND_KEEPGOING_IC=s
is used to determine whether optimization should stop. When a stopping condition is met, if total constraint infeasibility Σ_{i=1}^{n} infeas_{ci} (see the infeas_c array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasc column in the table produced when you specify PRINTLEVEL2=2) is > s, and the other conditions related to other AND_KEEPGOING parameters are also satisfied, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
AND_KEEPGOING_ID=s
is used to determine whether optimization should stop. When a stopping condition is met, if total dual infeasibility Σ_{i=1}^{n} infeas_{di} (see the infeas_d array in the section Interior Point Algorithmic Details on page 48; this value appears in the Tot_infeasd column in the table produced when you specify PRINTLEVEL2=2) is > s, and the other conditions related to other AND_KEEPGOING parameters are also satisfied, optimization will continue. This option is discussed in the section Stopping Criteria on page 134.
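As a sketch (hypothetical data set names), the AND_KEEPGOING_ options can be paired so that optimization continues past an ordinary stopping condition until both total infeasibilities are small:

```sas
proc intpoint
   arcdata=arcs condata=cons conout=sol
   and_keepgoing_ic=1.0E-4
   and_keepgoing_id=1.0E-4;
   run;
```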
CAPACITY Statement
CAPACITY variable ;
CAPAC variable ;
UPPERBD variable ;
The CAPACITY statement identifies the SAS variable in the ARCDATA= data set that contains the maximum feasible flow or capacity of the network arcs. If an observation contains nonarc variable information, the CAPACITY list variable is the upper value bound for the nonarc variable named in the NAME list variable in that observation.
When solving an LP, the CAPACITY statement identifies the SAS variable in the ARCDATA= data set that contains the maximum feasible value of the LP variables.
The CAPACITY list variable must have numeric values. It is not necessary to have a CAPACITY statement if the name of the SAS variable is _CAPAC_, _UPPER_, _UPPERBD, or _HI_.
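A sketch of the statement in context, assuming an ARCDATA= data set with hypothetical variables from, to, unitcost, and maxflow:

```sas
proc intpoint arcdata=arcs condata=cons conout=sol;
   tail     from;     /* arc tail nodes     */
   head     to;       /* arc head nodes     */
   cost     unitcost; /* per unit flow cost */
   capacity maxflow;  /* upper flow bounds  */
   run;
```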
COEF Statement
COEF variables ;
The COEF list is used with the sparse input format of the CONDATA= data set. The COEF list can contain more than one SAS variable, each of which must have numeric values. If the COEF statement is not specified, the CONDATA= data set is searched and SAS variables with names beginning with _COE are used. The number of SAS variables in the COEF list must be no greater than the number of SAS variables in the ROW list.
The values of the COEF list variables in an observation can be interpreted differently than these variables' values in other observations. The values can be coefficients in the side constraints, costs and objective function coefficients, bound data, constraint type data, or rhs data. If the COLUMN list variable has a value that is a name of an arc or a nonarc variable, the i th COEF list variable is associated with the constraint or special row name named in the i th ROW list variable. Otherwise, the COEF list variables indicate type values, rhs values, or missing values.
When solving an LP, the values of the COEF list variables in an observation can be interpreted differently than these variables' values in other observations. The values can be coefficients in the constraints, objective function coefficients, bound data, constraint type data, or rhs data. If the COLUMN list variable has a value that is a name of an LP variable, the i th COEF list variable is associated with the constraint or special row name named in the i th ROW list variable. Otherwise, the COEF list variables indicate type values, rhs values, or missing values.
COLUMN Statement
COLUMN variable ;
The COLUMN list is used with the sparse input format of the CONDATA= data set.
This list consists of one SAS variable in the CONDATA= data set that has as values the names of arc variables, nonarc variables, or missing values. When solving an LP, this list consists of one SAS variable in the CONDATA= data set that has as values the names of LP variables, or missing values. Some, if not all, of these values also can be values of the NAME list variables of the ARCDATA= data set. The COLUMN list variable can have other special values (refer to the TYPEOBS= and RHSOBS= options). If the COLUMN list is not specified after the PROC INTPOINT statement, the CONDATA= data set is searched and a SAS variable named _COLUMN_ is used. The COLUMN list variable must have character values.
COST Statement
COST variable ;
OBJFN variable ;
The COST statement identifies the SAS variable in the ARCDATA= data set that contains the per unit flow cost through an arc. If an observation contains nonarc variable information, the value of the COST list variable is the objective function coefficient of the nonarc variable named in the NAME list variable in that observation.
If solving an LP, the COST statement identifies the SAS variable in the ARCDATA= data set that contains the per unit objective function coefficient of an LP variable named in the NAME list variable in that observation.
The COST list variable must have numeric values. It is not necessary to specify a COST statement if the name of the SAS variable is _COST_ or _LENGTH_.
DEMAND Statement
DEMAND variable ;
The DEMAND statement identifies the SAS variable in the ARCDATA= data set that contains the demand at the node named in the corresponding HEADNODE list variable. The DEMAND list variable must have numeric values. It is not necessary to have a DEMAND statement if the name of this SAS variable is _DEMAND_. See the section Missing S Supply and Missing D Demand Values on page 123 for cases when the SUPDEM list variable values can have other values. There should be no DEMAND statement if you are solving an LP.
HEADNODE Statement
HEADNODE variable ;
HEAD variable ;
TONODE variable ;
TO variable ;
The HEADNODE statement specifies the SAS variable that must be present in the ARCDATA= data set that contains the names of nodes toward which arcs are directed. It is not necessary to have a HEADNODE statement if the name of the SAS variable is _HEAD_ or _TO_. The HEADNODE variable must have character values.
There should be no HEADNODE statement if you are solving an LP.
ID Statement
ID variables ;
The ID statement specifies SAS variables containing values for pre- and post-optimal processing and analysis. These variables are not processed by PROC INTPOINT but are read by the procedure and written in the CONOUT= data set. For example, imagine a network used to model a distribution system. The SAS variables listed on the ID statement can contain information on the type of vehicle, the transportation mode, the condition of the road, the time to complete the journey, the name of the driver, or other ancillary information useful for report writing or for describing facets of the operation that have no bearing on the optimization. The ID variables can be character, numeric, or both.
If no ID list is specified, the procedure forms an ID list of all SAS variables not included in any other implicit or explicit list specification. If the ID list is specified, any SAS variables in the ARCDATA= data set not in any list are dropped and do not appear in the CONOUT= data set.
LO Statement
LO variable ;
LOWERBD variable ;
MINFLOW variable ;
The LO statement identifies the SAS variable in the ARCDATA= data set that contains the minimum feasible flow or lower flow bound for arcs in the network. If an observation contains nonarc variable information, the LO list variable has the value of the lower bound for the nonarc variable named in the NAME list variable. If solving an LP, the LO statement identifies the SAS variable in the ARCDATA= data set that contains the lower value bound for LP variables. The LO list variables must have numeric values. It is not necessary to have a LO statement if the name of this SAS variable is _LOWER_, _LO_, _LOWERBD, or _MINFLOW.
NAME Statement
NAME variable ;
ARCNAME variable ;
VARNAME variable ;
Each arc and nonarc variable in an NPSC, or each variable in an LP, that has data in the CONDATA= data set must have a unique name. This variable is identified in the ARCDATA= data set. The NAME list variable must have character values (see the NAMECTRL= option in the PROC INTPOINT statement for more information). It is not necessary to have a NAME statement if the name of this SAS variable is _NAME_.
NODE Statement
NODE variable ;
The NODE list variable, which must be present in the NODEDATA= data set, has names of nodes as values. These values must also be values of the TAILNODE list variable, the HEADNODE list variable, or both. If this list is not explicitly specified, the NODEDATA= data set is searched for a SAS variable with the name _NODE_. The NODE list variable must have character values.
QUIT Statement
QUIT ;
The QUIT statement indicates that PROC INTPOINT is to stop immediately. The solution is not
saved in the CONOUT= data set. The QUIT statement has no options.
RHS Statement
RHS variable ;
The RHS variable list is used when the dense format of the CONDATA= data set is used. The values of the SAS variable specified in the RHS list are constraint right-hand-side values. If the RHS list is not specified, the CONDATA= data set is searched and a SAS variable with the name _RHS_ is used. The RHS list variable must have numeric values. If there is no RHS list and no SAS variable named _RHS_, all constraints are assumed to have zero right-hand-side values.
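A minimal dense-format sketch (constraint and variable names are hypothetical): each observation holds one constraint, and the right-hand side is in a variable named _RHS_, so no RHS statement is needed:

```sas
data cons;
   input _row_ $ x1 x2 _type_ $ _rhs_;
   datalines;
con1  2  3  le  12
con2  1  1  ge   4
;
```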
ROW Statement
ROW variables ;
The ROW list is used when either the sparse or the dense format of the CONDATA= data set is being used. SAS variables in the ROW list have values that are constraint or special row names. The SAS variables in the ROW list must have character values.
If the dense data format is used, there must be only one SAS variable in this list. In this case, if a ROW list is not specified, the CONDATA= data set is searched and the SAS variable with the name _ROW_ or _CON_ is used. If that search fails to find a suitable SAS variable, data for each constraint must reside in only one observation.
If the sparse data format is used and the ROW statement is not specified, the CONDATA= data set is searched and SAS variables with names beginning with _ROW or _CON are used. The number of SAS variables in the ROW list must not be less than the number of SAS variables in the COEF list. The i th ROW list variable is paired with the i th COEF list variable. If the number of ROW list variables is greater than the number of COEF list variables, the last ROW list variables have no COEF partner. These ROW list variables that have no corresponding COEF list variable are used in observations that have a TYPE list variable value. All ROW list variable values are tagged as having the type indicated. If there is no TYPE list variable, all ROW list variable values are constraint names.
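A sparse-format sketch (all names are hypothetical): the ROW, COLUMN, and COEF lists name one variable each, and each observation carries a single nonzero coefficient:

```sas
data cons;
   input _row_ $ _col_ $ _coef_;
   datalines;
con1  x1  2
con1  x2  3
con2  x1  1
con2  x2  1
;

proc intpoint sparsecondata condata=cons conout=sol;
   row    _row_;
   column _col_;
   coef   _coef_;
   run;
```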
RUN Statement
RUN ;
The RUN statement causes optimization to be started. The RUN statement has no options. If PROC INTPOINT is called and is not terminated because of an error or a QUIT statement, and you have not used a RUN statement, a RUN statement is assumed implicitly as the last statement of PROC INTPOINT. Therefore, PROC INTPOINT reads the data, performs optimization, and saves the optimal solution in the CONOUT= data set.
SUPDEM Statement
SUPDEM variable ;
The SAS variable in this list, which must be present in the NODEDATA= data set, contains supply and demand information for the nodes in the NODE list. A positive SUPDEM list variable value s (s > 0) denotes that the node named in the NODE list variable can supply s units of flow. A negative SUPDEM list variable value -d (d > 0) means that this node demands d units of flow. If a SAS variable is not explicitly specified, a SAS variable with the name _SUPDEM_ or _SD_ in the NODEDATA= data set is used as the SUPDEM variable. If a node is a transshipment node (neither a supply nor a demand node), an observation associated with this node need not be present in the NODEDATA= data set. If present, the SUPDEM list variable value must be zero or a missing value. See the section Missing S Supply and Missing D Demand Values on page 123 for cases when the SUPDEM list variable values can have other values.
SUPPLY Statement
SUPPLY variable ;
The SUPPLY statement identifies the SAS variable in the ARCDATA= data set that contains the supply at the node named in that observation's TAILNODE list variable. If a tail node does not supply flow, use zero or a missing value for the observation's SUPPLY list variable value. If a tail node has supply capability, a missing value indicates that the supply quantity is given in another observation. It is not necessary to have a SUPPLY statement if the name of this SAS variable is _SUPPLY_. See the section Missing S Supply and Missing D Demand Values on page 123 for cases when the SUPDEM list variable values can have other values. There should be no SUPPLY statement if you are solving an LP.
TAILNODE Statement
TAILNODE variable ;
TAIL variable ;
FROMNODE variable ;
FROM variable ;
The TAILNODE statement specifies the SAS variable that must (when solving an NPSC problem) be present in the ARCDATA= data set that has as values the names of tail nodes of arcs. The TAILNODE variable must have character values. It is not necessary to have a TAILNODE statement if the name of the SAS variable is _TAIL_ or _FROM_. If the TAILNODE list variable value is missing, it is assumed that the observation of the ARCDATA= data set contains information concerning a nonarc variable. There should be no TAILNODE statement if you are solving an LP.
TYPE Statement
TYPE variable ;
CONTYPE variable ;
The TYPE list, which is optional, names the SAS variable that has as values keywords that indicate either the constraint type for each constraint or the type of special rows in the CONDATA= data set. The values of the TYPE list variable also indicate, in each observation of the CONDATA= data set, how values of the VAR or COEF list variables are to be interpreted and how the type of each constraint or special row name is determined. If the TYPE list is not specified, the CONDATA= data set is searched and a SAS variable with the name _TYPE_ is used. Valid keywords for the TYPE variable are given below. If there is no TYPE statement and no other method is used to furnish type information (see the DEFCONTYPE= option), all constraints are assumed to be of the type "less than or equal to" and no special rows are used. The TYPE list variable must have character values and can be used when the data in the CONDATA= data set is in either the sparse or the dense format. If the TYPE list variable value has a * as its first character, the observation is ignored because it is a comment observation.
TYPE List Variable Values
The following are valid TYPE list variable values. The letters in boldface denote the characters that PROC INTPOINT uses to determine what type the value suggests. You need to have at least these characters. In the following list, the minimal TYPE list variable values have additional characters to aid you in remembering these values.

<            less than or equal to (≤)
=            equal to (=)
>            greater than or equal to (≥)
CAPAC        capacity
COST         cost
EQ           equal to
FREE         free row (used only for linear programs solved by interior point)
GE           greater than or equal to
LE           less than or equal to
LOWERBD      lower flow or value bound
LOWblank     lower flow or value bound
MAXIMIZE     maximize (opposite of cost)
MINIMIZE     minimize (same as cost)
OBJECTIVE    objective function (same as cost)
RHS          rhs of constraint
TYPE         type of constraint
UPPCOST      reserved for future use
UNREST       unrestricted variable (used only for linear programs solved by interior point)
UPPER        upper value bound or capacity; second letter must not be N

The valid TYPE list variable values in function order are
- LE    less than or equal to (≤)
- EQ    equal to (=)
- GE    greater than or equal to (≥)
- COST, MINIMIZE, MAXIMIZE, OBJECTIVE    cost or objective function coefficient
- CAPAC, UPPER    capacity or upper value bound
- LOWERBD, LOWblank    lower flow or value bound
- RHS    rhs of constraint
- TYPE    type of constraint

A TYPE list variable value that has the first character * causes the observation to be treated as a comment. If the first character is a negative sign, then ≤ is the type. If the first character is a zero, then = is the type. If the first character is a positive number, then ≥ is the type.
VAR Statement
VAR variables ;
The VAR variable list is used when the dense data format is used for the CONDATA= data set. The names of these SAS variables are also names of the arc and nonarc variables that have data in the CONDATA= data set. If solving an LP, the names of these SAS variables are also names of the LP variables. If no explicit VAR list is specified, all numeric SAS variables in the CONDATA= data set that are not in other SAS variable lists are put onto the VAR list. The VAR list variables must have numeric values. The values of the VAR list variables in some observations can be interpreted differently than in other observations. The values can be coefficients in the side constraints, costs and objective function coefficients, or bound data. When solving an LP, the values of the SAS variables in the VAR list can be constraint coefficients, objective function coefficients, or bound data. How these numeric values are interpreted depends on the value of each observation's TYPE or ROW list variable value. If there are no TYPE list variables, the VAR list variable values are all assumed to be side constraint coefficients.
Details: INTPOINT Procedure
Input Data Sets
PROC INTPOINT is designed so that there are as few rules as possible that you must obey when
inputting a problem's data. Raw data are acceptable. This should cut the amount of processing
required to groom the data before it is input to PROC INTPOINT. Data formats are so flexible
that, due to space restrictions, all possible forms for a problem's data are not shown here. Try any
reasonable form for your problem's data; it should be acceptable. PROC INTPOINT will outline its
objections.
You can supply the same piece of data several ways. You do not have to restrict yourself to using any
particular one. If you use several ways, PROC INTPOINT checks that the data are consistent each
time that the data are encountered. After all input data sets have been read, data are merged so that
the problem is described completely. The observations can be in any order.
ARCDATA= Data Set
See the section Getting Started: NPSC Problems on page 63 and the section Introductory NPSC
Example on page 64 for a description of this input data set.
NOTE: Information for an arc or nonarc variable can be specified in more than one observation. For
example, consider an arc directed from node A toward node B that has a cost of 50, capacity of 100,
and lower flow bound of 10 flow units. Some possible observations in the ARCDATA= data set are
as follows:
_tail_ _head_ _cost_ _capac_ _lo_
A B 50 . .
A B . 100 .
A B . . 10
A B 50 100 .
A B . 100 10
A B 50 . 10
A B 50 100 10
Similarly, for a nonarc variable that has an upper bound of 100, a lower bound of 10, and an objective
function coefficient of 50, the _TAIL_ and _HEAD_ values are missing.
When solving an LP that has an LP variable named my_var with an upper bound of 100, a lower bound
of 10, and an objective function coefficient of 50, some possible observations in the ARCDATA=
data set are
_name_ _cost_ _capac_ _lo_
my_var 50 . .
my_var . 100 .
my_var . . 10
my_var 50 100 .
my_var . 100 10
my_var 50 . 10
my_var 50 100 10
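The way data spread over several observations combines can be pictured with a short Python sketch. It is hypothetical: the field names mirror the data sets above, and None stands for a missing value. For simplicity the sketch treats any disagreement between non-missing values as an error; as the section Tightening Bounds and Side Constraints explains, PROC INTPOINT instead tightens bound data.

```python
def merge_arc_observations(observations):
    """observations: list of dicts with keys 'cost', 'capac', 'lo';
    None represents a missing value. Non-missing values are merged,
    and conflicting non-missing values raise an error."""
    merged = {'cost': None, 'capac': None, 'lo': None}
    for obs in observations:
        for key, value in obs.items():
            if value is None:
                continue
            if merged[key] is None:
                merged[key] = value           # first non-missing value seen
            elif merged[key] != value:
                raise ValueError("conflicting %s values for this arc" % key)
    return merged

obs = [
    {'cost': 50, 'capac': None, 'lo': None},
    {'cost': None, 'capac': 100, 'lo': None},
    {'cost': 50, 'capac': None, 'lo': 10},
]
print(merge_arc_observations(obs))   # {'cost': 50, 'capac': 100, 'lo': 10}
```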
CONDATA= Data Set
Regardless of whether the data in the CONDATA= data set is in the sparse or dense format, you will
receive a warning if PROC INTPOINT finds a constraint row that has no coefficients. You will also
be warned if any nonarc or LP variable has no constraint coefficients.
Dense Input Format
If the dense format is used, most SAS variables in the CONDATA= data set belong to the VAR list.
The names of the SAS variables belonging to this list have names of arc and nonarc variables or, if
solving an LP, names of the LP variables. These names can be values of the SAS variables in the
ARCDATA= data set that belong to the NAME list, or names of nonarc variables, or names in the
form tail_head, or any combination of these three forms. Names in the form tail_head are default
arc names, and if you use them, you must specify node names in the ARCDATA= data set (values of
the TAILNODE and HEADNODE list variables).
The CONDATA= data set can have three other SAS variables belonging, respectively, to the ROW,
the TYPE, and the RHS lists. The CONDATA= data set of the oil industry example in the section
Introductory NPSC Example on page 64 uses the dense data format.
Consider the SAS code that creates a dense format CONDATA= data set that has data for four
constraints. This data set was used in the section Introductory NPSC Example on page 64.
data cond1;
input m_e_ref1 m_e_ref2 thruput1 r1_gas thruput2 r2_gas
_type_ $ _rhs_;
datalines;
-2 . 1 . . . >= -15
. -2 . . 1 . GE -15
. . -3 4 . . EQ 0
. . . . -3 4 = 0
;
You can use nonconstraint type values to furnish data on costs, capacities, lower flow bounds (and, if
there are nonarc or LP variables, objective function coefficients and upper and lower bounds). You
need not have such (or as much) data in the ARCDATA= data set. The first three observations in the
following data set are examples of observations that provide cost, capacity, and lower bound data.
data cond1b;
input m_e_ref1 m_e_ref2 thruput1 r1_gas thruput2 r2_gas
_type_ $ _rhs_;
datalines;
63 81 200 . 220 . cost .
95 80 175 140 100 100 capac .
20 10 50 . 35 . lo .
-2 . 1 . . . >= -15
. -2 . . 1 . GE -15
. . -3 4 . . EQ 0
. . . . -3 4 = 0
;
If a ROW list variable is used, the data for a constraint can be spread over more than one observation.
To illustrate, the data for the first constraint (which is called con1) and the cost and capacity data (in
special rows called costrow and caprow, respectively) are spread over more than one observation in
the following data set.
data cond1c;
input _row_ $
m_e_ref1 m_e_ref2 thruput1 r1_gas thruput2 r2_gas
_type_ $ _rhs_;
datalines;
costrow 63 . . . . . . .
costrow . 81 200 . . . cost .
. . . . . 220 . cost .
caprow . . . . . . capac .
caprow 95 . 175 . 100 100 . .
caprow . 80 175 140 . . . .
lorow 20 10 50 . 35 . lo .
con1 -2 . 1 . . . . .
con1 . . . . . . >= -15
con2 . -2 . . 1 . GE -15
con3 . . -3 4 . . EQ 0
con4 . . . . -3 4 = 0
;
Using both ROW and TYPE lists, you can use special row names. Examples of these are costrow
and caprow in the last data set. It should be restated that in any of the input data sets of PROC
INTPOINT, the order of the observations does not matter. However, the CONDATA= data set can be
read more quickly if PROC INTPOINT knows what type of constraint or special row a ROW list
variable value is. For example, when the first observation is read, PROC INTPOINT does not know
whether costrow is a constraint or special row and how to interpret the value 63 for the arc with the
name m_e_ref1. When PROC INTPOINT reads the second observation, it learns that costrow has cost
type and that the values 81 and 200 are costs. When the entire CONDATA= data set has been read,
PROC INTPOINT knows the type of all special rows and constraints. Data that PROC INTPOINT
had to set aside (such as the first observation 63 value and the costrow ROW list variable value, which
at the time had unknown type, but is subsequently known to be a cost special row) is reprocessed.
During this second pass, if a ROW list variable value has unassigned constraint or special row type,
it is treated as a constraint with DEFCONTYPE= (or DEFCONTYPE= default) type. Associated
VAR list variable values are coefficients of that constraint.
Sparse Input Format
The side constraints usually become sparse as the problem size increases. When the sparse data
format of the CONDATA= data set is used, only nonzero constraint coefficients must be specified.
Remember to specify the SPARSECONDATA option in the PROC INTPOINT statement. With the
sparse method of specifying constraint information, the names of arc and nonarc variables or, if
solving an LP, the names of LP variables do not have to be valid SAS variable names.
A sparse format CONDATA= data set for the oil industry example in the section Introductory NPSC
Example on page 64 is displayed below.
title 'Setting Up Condata = Cond2 for PROC INTPOINT';
data cond2;
input _column_ $ _row1 $ _coef1 _row2 $ _coef2 ;
datalines;
m_e_ref1 con1 -2 . .
m_e_ref2 con2 -2 . .
thruput1 con1 1 con3 -3
r1_gas . . con3 4
thruput2 con2 1 con4 -3
r2_gas . . con4 4
_type_ con1 1 con2 1
_type_ con3 0 con4 0
_rhs_ con1 -15 con2 -15
;
Recall that the COLUMN list variable values _type_ and _rhs_ are the default values of the
TYPEOBS= and RHSOBS= options. Also, the default rhs value of constraints (con3 and con4) is zero. The
third to last observation has the value _type_ for the COLUMN list variable. The _ROW1 variable
value is con1, and the _COEF1 variable has the value 1. This indicates that the constraint con1 is
greater than or equal to type (because the value 1 is greater than zero). Similarly, the data in the
second to last observation's _ROW1, _COEF1, _ROW2, and _COEF2 variables indicate that con3 and
con4 are equality constraints (0 equals zero).
An alternative, using a TYPE list variable, is
title 'Setting Up Condata = Cond3 for PROC INTPOINT';
data cond3;
input _column_ $ _row1 $ _coef1 _row2 $ _coef2 _type_ $ ;
datalines;
m_e_ref1 con1 -2 . . >=
m_e_ref2 con2 -2 . . .
thruput1 con1 1 con3 -3 .
r1_gas . . con3 4 .
thruput2 con2 1 con4 -3 .
r2_gas . . con4 4 .
. con3 . con4 . eq
. con1 -15 con2 -15 ge
;
If the COLUMN list variable is missing in a particular observation (the last 2 observations in the
data set cond3, for instance), the constraints named in the ROW list variables all have the constraint
type indicated by the value in the TYPE list variable. It is for this type of observation that you are
allowed more ROW list variables than COEF list variables. If corresponding COEF list variables are
not missing (for example, the last observation in the data set cond3), these values are the rhs values
of those constraints. Therefore, you can specify both constraint type and rhs in the same observation.
As in the previous CONDATA= data set, if the COLUMN list variable is an arc or nonarc variable,
the COEF list variable values are coefficient values for that arc or nonarc variable in the constraints
indicated in the corresponding ROW list variables. If in this same observation the TYPE list variable
contains a constraint type, all constraints named in the ROW list variables in that observation have
this constraint type (for example, the first observation in the data set cond3). Therefore, you can
specify both constraint type and coefficient information in the same observation.
Also note that DEFCONTYPE=EQ could have been specified, saving you from having to include in
the data that con3 and con4 are of this type.
In the oil industry example, arc costs, capacities, and lower flow bounds are presented in the
ARCDATA= data set. Alternatively, you could have used the following input data sets. The arcd2
data set has only two SAS variables. For each arc, there is an observation in which the arc's tail and
head node are specified.
title3 'Setting Up Arcdata = Arcd2 for PROC INTPOINT';
data arcd2;
input _from_&$11. _to_&$15. ;
datalines;
middle east refinery 1
middle east refinery 2
u.s.a. refinery 1
u.s.a. refinery 2
refinery 1 r1
refinery 2 r2
r1 ref1 gas
r1 ref1 diesel
r2 ref2 gas
r2 ref2 diesel
ref1 gas servstn1 gas
ref1 gas servstn2 gas
ref1 diesel servstn1 diesel
ref1 diesel servstn2 diesel
ref2 gas servstn1 gas
ref2 gas servstn2 gas
ref2 diesel servstn1 diesel
ref2 diesel servstn2 diesel
;
title 'Setting Up Condata = Cond4 for PROC INTPOINT';
data cond4;
input _column_&$27. _row1 $ _coef1 _row2 $ _coef2 _type_ $ ;
datalines;
. con1 -15 con2 -15 ge
. costrow . . . cost
. . . caprow . capac
middle east_refinery 1 con1 -2 . . .
middle east_refinery 2 con2 -2 . . .
refinery 1_r1 con1 1 con3 -3 .
r1_ref1 gas . . con3 4 =
refinery 2_r2 con2 1 con4 -3 .
r2_ref2 gas . . con4 4 eq
middle east_refinery 1 costrow 63 caprow 95 .
middle east_refinery 2 costrow 81 caprow 80 .
u.s.a._refinery 1 costrow 55 . . .
u.s.a._refinery 2 costrow 49 . . .
refinery 1_r1 costrow 200 caprow 175 .
refinery 2_r2 costrow 220 caprow 100 .
r1_ref1 gas . . caprow 140 .
r1_ref1 diesel . . caprow 75 .
r2_ref2 gas . . caprow 100 .
r2_ref2 diesel . . caprow 75 .
ref1 gas_servstn1 gas costrow 15 caprow 70 .
ref1 gas_servstn2 gas costrow 22 caprow 60 .
ref1 diesel_servstn1 diesel costrow 18 . . .
ref1 diesel_servstn2 diesel costrow 17 . . .
ref2 gas_servstn1 gas costrow 17 caprow 35 .
ref2 gas_servstn2 gas costrow 31 . . .
ref2 diesel_servstn1 diesel costrow 36 . . .
ref2 diesel_servstn2 diesel costrow 23 . . .
middle east_refinery 1 . 20 . . lo
middle east_refinery 2 . 10 . . lo
refinery 1_r1 . 50 . . lo
refinery 2_r2 . 35 . . lo
ref2 gas_servstn1 gas . 5 . . lo
;
The first observation in the cond4 data set defines con1 and con2 as greater than or equal to (≥)
constraints that both (by coincidence) have rhs values of -15. The second observation defines the
special row costrow as a cost row. When costrow is a ROW list variable value, the associated COEF
list variable value is interpreted as a cost or objective function coefficient. PROC INTPOINT has to
do less work if constraint names and special rows are defined in observations near the top of a data
set, but this is not a strict requirement. The fourth to ninth observations contain constraint coefficient
data. Observations seven and nine have TYPE list variable values that indicate that constraints
con3 and con4 are equality constraints. The last five observations contain lower flow bound data.
Observations that have an arc or nonarc variable name in the COLUMN list variable, a nonconstraint
type TYPE list variable value, and a value in (one of) the COEF list variables are valid.
The following data set is equivalent to the cond4 data set.
title 'Setting Up Condata = Cond5 for PROC INTPOINT';
data cond5;
input _column_&$27. _row1 $ _coef1 _row2 $ _coef2 _type_ $ ;
datalines;
middle east_refinery 1 con1 -2 costrow 63 .
middle east_refinery 2 con2 -2 lorow 10 .
refinery 1_r1 . . con3 -3 =
r1_ref1 gas caprow 140 con3 4 .
refinery 2_r2 con2 1 con4 -3 .
r2_ref2 gas . . con4 4 eq
. CON1 -15 CON2 -15 GE
ref2 diesel_servstn1 diesel . 36 costrow . cost
. . . caprow . capac
. lorow . . . lo
middle east_refinery 1 caprow 95 lorow 20 .
middle east_refinery 2 caprow 80 costrow 81 .
u.s.a._refinery 1 . . . 55 cost
u.s.a._refinery 2 costrow 49 . . .
refinery 1_r1 con1 1 caprow 175 .
refinery 1_r1 lorow 50 costrow 200 .
refinery 2_r2 costrow 220 caprow 100 .
refinery 2_r2 . 35 . . lo
r1_ref1 diesel caprow2 75 . . capac
r2_ref2 gas . . caprow 100 .
r2_ref2 diesel caprow2 75 . . .
ref1 gas_servstn1 gas costrow 15 caprow 70 .
ref1 gas_servstn2 gas caprow2 60 costrow 22 .
ref1 diesel_servstn1 diesel . . costrow 18 .
ref1 diesel_servstn2 diesel costrow 17 . . .
ref2 gas_servstn1 gas costrow 17 lorow 5 .
ref2 gas_servstn1 gas . . caprow2 35 .
ref2 gas_servstn2 gas . 31 . . cost
ref2 diesel_servstn2 diesel . . costrow 23 .
;
Converting from an NPSC to an LP Problem
If you have data for a linear program that has an embedded network, the steps required
to change that data into a form that is acceptable by PROC INTPOINT are
1. Identify the nodal flow conservation constraints. The coefficient matrix of these constraints (a
submatrix of the LP's constraint coefficient matrix) has only two nonzero elements in each
column, -1 and 1.
2. Assign a node to each nodal ow conservation constraint.
3. The rhs values of conservation constraints are the corresponding nodes' supplies and demands.
Use this information to create the NODEDATA= data set.
4. Assign an arc to each column of the flow conservation constraint coefficient matrix. The arc is
directed from the node associated with the row that has the 1 element in it and directed toward
the node associated with the row that has the -1 element in it. Set up the ARCDATA= data
set that has two SAS variables. This data set could resemble ARCDATA=arcd2. These will
eventually be the TAILNODE and HEADNODE list variables when PROC INTPOINT is used.
Each observation consists of the tail and head node of each arc.
5. Remove from the data of the linear program all data concerning the nodal flow conservation
constraints.
6. Put the remaining data into a CONDATA= data set. This data set will probably resemble
CONDATA=cond4 or CONDATA=cond5.
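Steps 1 and 4 above can be sketched in Python. This is an illustration, not PROC INTPOINT's algorithm, and it assumes the sign convention that the tail node's row holds the 1 and the head node's row holds the -1:

```python
def column_as_arc(column, conservation_rows):
    """column: {row_name: coefficient} dict for one LP column.
    Return (tail, head) if the column's coefficients in the
    conservation rows are exactly one +1 and one -1, else None."""
    coefs = {r: c for r, c in column.items() if r in conservation_rows}
    if len(coefs) != 2 or sorted(coefs.values()) != [-1, 1]:
        return None
    tail = next(r for r, c in coefs.items() if c == 1)    # flow leaves here
    head = next(r for r, c in coefs.items() if c == -1)   # flow arrives here
    return (tail, head)

rows = {'node_A', 'node_B', 'node_C'}
print(column_as_arc({'node_A': 1, 'node_B': -1}, rows))   # ('node_A', 'node_B')
print(column_as_arc({'node_A': 1, 'side1': 2}, rows))     # None
```

Columns for which the function returns None are the nonarc (or pure LP) variables; their data stay in the CONDATA= data set.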
The Sparse Format Summary
The following list illustrates possible CONDATA= data set observation sparse formats. a1, b1, b2, b3
and c1 have as a _COLUMN_ variable value either the name of an arc (possibly in the form tail_head)
or the name of a nonarc variable (if you are solving an NPSC), or the name of the LP variable (if you
are solving an LP). These are collectively referred to as variable in the tables that follow.
- If there is no TYPE list variable in the CONDATA= data set, the problem must be constrained
and there is no nonconstraint data in the CONDATA= data set:
_COLUMN_ _ROWx_ _COEFx_ _ROWy_
(no _COEFy_)
(may not be
in CONDATA)
a1 variable constraint lhs coef +------------+
a2 _TYPE_ or constraint -1 0 1 | |
TYPEOBS= | |
a3 _RHS_ or constraint rhs value | constraint |
RHSOBS= or | or |
missing | missing |
a4 _TYPE_ or constraint missing | |
TYPEOBS= | |
a5 _RHS_ or constraint missing | |
RHSOBS= or +------------+
missing
Observations of the form a4 and a5 serve no useful purpose but are still allowed to make
problem generation easier.
- If there are no ROW list variables in the data set, the problem has no constraints and the
information is nonconstraint data. There must be a TYPE list variable and only one COEF list
variable in this case. The COLUMN list variable has as values the names of arcs or nonarc
variables and must not have missing values or special row names as values:
_COLUMN_ _TYPE_ _COEFx_
b1 variable UPPERBD capacity
b2 variable LOWERBD lower flow
b3 variable COST cost
- Using a TYPE list variable for constraint data implies the following:
_COLUMN_ _TYPE_ _ROWx_ _COEFx_ _ROWy_
(no _COEFy_)
(may not be
in CONDATA)
c1 variable missing +-----+ lhs coef +------------+
c2 _TYPE_ or missing | c | -1 0 1 | |
TYPEOBS= | o | | |
c3 _RHS_ or missing | n | rhs value | constraint |
missing | s | | or |
or RHSOBS= | t | | missing |
c4 variable con type | r | lhs coef | |
c5 _RHS_ or con type | a | rhs value | |
missing | i | | |
or RHSOBS= | n | | |
c6 missing TYPE | t | -1 0 1 | |
c7 missing RHS +-----+ rhs value +------------+
If the observation is in form c4 or c5, and the _COEFx_ values are missing, the constraint is
assigned the type data specified in the _TYPE_ variable.
- Using a TYPE list variable for arc and nonarc variable data implies the following:
_COLUMN_ _TYPE_ _ROWx_ _COEFx_ _ROWy_
(no _COEFy_)
(may not be
in CONDATA)
+---------+ +---------+ +---------+
d1 variable | UPPERBD | | missing | capacity | missing |
d2 variable | LOWERBD | | or | lowerflow | or |
d3 variable | COST | | special | cost | special |
| | | row | | row |
| | | name | | name |
| | +---------+ | |
d4 missing | | | special | | |
| | | row | | |
+---------+ | name | +---------+
d5 variable missing | | value that missing
| |is interpreted
| |according to
+---------+ _ROWx_
The observations of the form d1 to d5 can have ROW list variable values. Observation d4
must have ROW list variable values. The ROW value is put into the ROW name tree so that
when dealing with observation d4 or d5, the COEF list variable value is interpreted according
to the type of ROW list variable value. For example, the following three observations define
the _ROWx_ variable values up_row, lo_row, and co_row as being an upper value bound row,
lower value bound row, and cost row, respectively:
_COLUMN_ _TYPE_ _ROWx_ _COEFx_
. UPPERBD up_row .
variable_a LOWERBD lo_row lower flow
variable_b COST co_row cost
PROC INTPOINT is now able to correctly interpret the following observation:
_COLUMN_ _TYPE_ _ROW1_ _COEF1_ _ROW2_ _COEF2_ _ROW3_ _COEF3_
var_c . up_row upval lo_row loval co_row cost
If the TYPE list variable value is a constraint type and the value of the COLUMN list variable
equals the value of the TYPEOBS= option or the default value _TYPE_, the TYPE list variable
value is ignored.
NODEDATA= Data Set
See the section Getting Started: NPSC Problems on page 63 and the section Introductory NPSC
Example on page 64 for a description of this input data set.
Output Data Sets
For NPSC problems, the procedure determines the flow that should pass through each arc as well as
the value that should be assigned to each nonarc variable. The goal is that the minimum flow bounds,
capacities, lower and upper value bounds, and side constraints are not violated. This goal is reached
when such a flow pattern and value assignment are feasible and the total cost incurred is optimal. The
solution found must also conserve flow at each node.
For LP problems, the procedure determines the value that should be assigned to each variable. The
goal is that the lower and upper value bounds and the constraints are not violated. This goal is
reached when such a value assignment is feasible and the total cost incurred is optimal.
The CONOUT= data set can be produced and contains a solution obtained after performing
optimization.
CONOUT= Data Set
The variables in the CONOUT= data set depend on whether or not the problem has a network
component. If the problem has a network component, the variables and their possible values in an
observation are as follows:
_FROM_ a tail node of an arc. This is a missing value if an observation has
information about a nonarc variable.
_TO_ a head node of an arc. This is a missing value if an observation
has information about a nonarc variable.
_COST_ the cost of an arc or the objective function coefficient of a nonarc
variable
_CAPAC_ the capacity of an arc or upper value bound of a nonarc variable
_LO_ the lower flow bound of an arc or lower value bound of a nonarc
variable
_NAME_ a name of an arc or nonarc variable
_SUPPLY_ the supply of the tail node of the arc in the observation. This is a
missing value if an observation has information about a nonarc
variable.
_DEMAND_ the demand of the head node of the arc in the observation. This is
a missing value if an observation has information about a nonarc
variable.
_FLOW_ the flow through the arc or value of the nonarc variable
_FCOST_ flow cost, the product of _COST_ and _FLOW_
_RCOST_ the reduced cost of the arc or nonarc variable
_ANUMB_ the number of the arc (positive) or nonarc variable (nonpositive);
used for warm starting PROC NETFLOW
_TNUMB_ the number of the tail node in the network basis spanning tree;
used for warm starting PROC NETFLOW
_STATUS_ the status of the arc or nonarc variable
If the problem does not have a network component, the variables and their possible values in an
observation are as follows:
_OBJFN_ the objective function coefficient of a variable
_UPPERBD the upper value bound of a variable
_LOWERBD the lower value bound of a variable
_NAME_ the name of a variable
_VALUE_ the value of the variable
_FCOST_ objective function value for that variable; the product of _OBJFN_
and _VALUE_
The variables present in the ARCDATA= data set are present in a CONOUT= data set. For example,
if there is a variable called tail in the ARCDATA= data set and you specied the SAS variable list
from tail;
then tail is a variable in the CONOUT= data sets instead of _FROM_. Any ID list variables also
appear in the CONOUT= data sets.
MPSOUT= Data Set
The MPSOUT= data set contains problem data converted from a PROC INTPOINT format into an
MPS-format SAS data set. The six fields, FIELD1 to FIELD6, in the MPSOUT= data set correspond to
the six columns in MPS standard. For more information about the MPS-format SAS data set, see
Chapter 16, The MPS-Format SAS Data Set.
Converting Any PROC INTPOINT Format to an MPS-Format SAS Data
Set
The MPSOUT= option enables you to convert an input data set for the INTPOINT procedure into an
MPS-format SAS data set. The converted data set is readable by the OPTLP procedure.
The conversion can handle linear programs and network flow formulations. If you specify a network
flow formulation, it will be converted into an equivalent linear program. When multiple objective
row names are present, rows with the name encountered first are combined into the objective row.
The remaining rows are marked as free rows.
For information about how the contents of the MPS-format SAS data set are interpreted, see
Chapter 16, The MPS-Format SAS Data Set.
Case Sensitivity
Whenever the INTPOINT procedure has to compare character strings, whether they are node names,
arc names, nonarc names, LP variable names, or constraint names, if the two strings have different
lengths, or if, on a character-by-character basis, a character differs or has a different case, PROC
INTPOINT judges the character strings to be different.
Not only is this rule enforced when one or both character strings are obtained as values of SAS
variables in PROC INTPOINT's input data sets, it also should be obeyed if one or both character
strings were originally SAS variable names, or were obtained as the values of options or statements
passed to PROC INTPOINT. For example, if the network has only one node that has supply capability,
or if you are solving a MAXFLOW or SHORTPATH problem, you can indicate that node using the
SOURCE= option. If you specify
proc intpoint source=NotableNode
then PROC INTPOINT looks for a value of the TAILNODE list variable that is NotableNode.
Version 6 of the SAS System converts text that makes up statements into uppercase. The name of the
node searched for would be NOTABLENODE, even if this was your SAS code:
proc intpoint source=NotableNode
If you want PROC INTPOINT to behave as it did in Version 6, specify
options validvarname=v6;
If the SPARSECONDATA option is not specied, and you are running SAS software Version 6, or
you are running SAS software Version 7 onward and have specied
options validvarname=v6;
all values of the SAS variables that belong to the NAME list are uppercased. This is because the SAS
System has uppercased all SAS variable names, particularly those in the VAR list of the CONDATA=
data set.
Entities that contain blanks must be enclosed in quotes.
Loop Arcs
Loop arcs (which are arcs directed toward nodes from which they originate) are prohibited. Rather,
introduce a dummy intermediate node in loop arcs. For example, replace arc (A,A) with (A,B) and
(B,A); B is the name of a new node, and it must be distinct for each loop arc.
Multiple Arcs
Multiple arcs with the same tail and head nodes are prohibited. PROC INTPOINT checks to ensure
there are no such arcs before proceeding with the optimization. Introduce a new dummy intermediate
node in multiple arcs. This node must be distinct for each multiple arc. For example, if some network
has three arcs directed from node A toward node B, then replace one of these three with arcs (A,C)
and (C,B) and replace another one with (A,D) and (D,B). C and D are new nodes added to the network.
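Both transformations, splitting loop arcs and splitting duplicate arcs through dummy intermediate nodes, can be sketched together in Python. This is illustrative only: dummy node names such as _dummy1 are invented, and any cost, capacity, or bound data attached to a split arc would also need to be reassigned, which this sketch omits.

```python
def split_arcs(arcs):
    """arcs: list of (tail, head) pairs. Returns an equivalent list
    with no loop arcs and no repeated (tail, head) pairs, routing
    offenders through fresh dummy intermediate nodes."""
    seen = set()
    result = []
    counter = [0]

    def fresh():
        counter[0] += 1
        return '_dummy%d' % counter[0]    # distinct node per split arc

    for tail, head in arcs:
        if tail == head or (tail, head) in seen:
            mid = fresh()
            result.append((tail, mid))
            result.append((mid, head))
        else:
            seen.add((tail, head))
            result.append((tail, head))
    return result

print(split_arcs([('A', 'A'), ('A', 'B'), ('A', 'B')]))
```

The loop arc (A,A) becomes (A,_dummy1) and (_dummy1,A), and the second copy of (A,B) becomes (A,_dummy2) and (_dummy2,B), leaving one direct (A,B) arc.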
Flow and Value Bounds
The capacity and lower flow bound of an arc can be equal. Negative arc capacities and lower flow
bounds are permitted. If both arc capacities and lower flow bounds are negative, the lower flow
bound must be at least as negative as the capacity. An arc (A,B) that has a negative flow of -f units
can be interpreted as an arc that conveys f units of flow from node B to node A.
The upper and lower value bound of a nonarc variable can be equal. Negative upper and lower
bounds are permitted. If both are negative, the lower bound must be at least as negative as the upper
bound.
When solving an LP, the upper and lower value bounds of an LP variable can be equal. Negative
upper and lower bounds are permitted. If both are negative, the lower bound must be at least as
negative as the upper bound.
In short, for any problem to be feasible, a lower bound must be ≤ the associated upper bound.
Tightening Bounds and Side Constraints
If any piece of data is furnished to PROC INTPOINT more than once, PROC INTPOINT checks for
consistency so that no conflict exists concerning the data values. For example, if the cost of some arc
is seen to be one value and as more data are read, the cost of the same arc is seen to be another value,
PROC INTPOINT issues an error message on the SAS log and stops. There are two exceptions to
this:
- The bounds of arcs and nonarc variables, or the bounds of LP variables, are made as tight as
possible. If several different values are given for the lower flow bound of an arc, the greatest
value is used. If several different values are given for the lower bound of a nonarc or LP
variable, the greatest value is used. If several different values are given for the capacity of an
arc, the smallest value is used. If several different values are given for the upper bound of a
nonarc or LP variable, the smallest value is used.
- Several values can be given for inequality constraint right-hand sides. For a particular con-
straint, the lowest rhs value is used for the rhs if the constraint is of less than or equal to
type. For a particular constraint, the greatest rhs value is used for the rhs if the constraint is of
greater than or equal to type.
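The tightening rules above can be summarized in a minimal Python sketch; the helper names are hypothetical, and PROC INTPOINT applies the equivalent logic while merging input data:

```python
def tighten_lower(values):
    """Repeated lower flow or value bounds: the greatest value wins."""
    return max(values)

def tighten_upper(values):
    """Repeated capacities or upper bounds: the smallest value wins."""
    return min(values)

def tighten_rhs(con_type, values):
    """Repeated inequality rhs values: keep the most restrictive side."""
    if con_type == '<=':
        return min(values)   # lowest rhs for <= constraints
    if con_type == '>=':
        return max(values)   # greatest rhs for >= constraints
    raise ValueError('repeated equality rhs values must agree exactly')

print(tighten_lower([10, 15, 12]))    # 15
print(tighten_upper([100, 90, 95]))   # 90
print(tighten_rhs('>=', [-15, -20]))  # -15
```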
Reasons for Infeasibility
Before optimization commences, PROC INTPOINT tests to ensure that the problem is not infeasible
by ensuring that, with respect to supplies, demands, and arc flow bounds, flow conservation can be
obeyed at each node:
- Let IN be the sum of lower flow bounds of arcs directed toward a node plus the node's supply.
Let OUT be the sum of capacities of arcs directed from that node plus the node's demand. If
IN exceeds OUT, not enough flow can leave the node.
- Let OUT be the sum of lower flow bounds of arcs directed from a node plus the node's demand.
Let IN be the total capacity of arcs directed toward the node plus the node's supply. If OUT
exceeds IN, not enough flow can arrive at the node.
Reasons why a network problem can be infeasible are similar to those previously mentioned but
apply to a set of nodes rather than to an individual node.
Consider the network illustrated in Figure 4.10.
Figure 4.10 An Infeasible Network
NODE_1----------------->NODE_2
/ capac=55 \
/ lo=50 \
/ \
/ \
/ \
NODE_3 NODE_4
supply=100 \ / demand=120
\ /
\ /
\ capac=62 /
\ lo=60 /
NODE_5----------------->NODE_6
The demand of NODE_4 is 120. That can never be satisfied because the maximal flow through arcs
(NODE_1, NODE_2) and (NODE_5, NODE_6) is 117. More specifically, the implicit supply of
NODE_2 and NODE_6 is only 117, which is insufficient to satisfy the demand of other nodes (real
or implicit) in the network.
Furthermore, the lower flow bounds of arcs (NODE_1, NODE_2) and (NODE_5, NODE_6) are
greater than the flow that can reach the tail nodes of these arcs, that, by coincidence, is the total
supply of the network. The implicit demand of nodes NODE_1 and NODE_5 is 110, which is greater
than the amount of flow that can reach these nodes.
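The two single-node tests can be sketched in Python. This is illustrative only: arcs are represented as (tail, head, lower bound, capacity) tuples, and the supply and demand dictionaries are hypothetical. Note that the network in Figure 4.10 passes these single-node tests (given large default capacities on the unlabeled arcs); detecting its infeasibility requires the set-of-nodes generalization mentioned above.

```python
def node_flow_feasible(node, arcs, supply, demand):
    """Apply both IN/OUT tests at one node; True means the tests pass.
    arcs: list of (tail, head, lo, capac) tuples."""
    lo_in = sum(lo for t, h, lo, cap in arcs if h == node)
    cap_out = sum(cap for t, h, lo, cap in arcs if t == node)
    # Test 1: flow forced into the node must be able to leave it.
    if lo_in + supply.get(node, 0) > cap_out + demand.get(node, 0):
        return False
    lo_out = sum(lo for t, h, lo, cap in arcs if t == node)
    cap_in = sum(cap for t, h, lo, cap in arcs if h == node)
    # Test 2: flow forced out of the node must be able to reach it.
    if lo_out + demand.get(node, 0) > cap_in + supply.get(node, 0):
        return False
    return True

arcs = [('S', 'X', 0, 5)]
# X demands 10 units but at most 5 can arrive, so test 2 fails at X.
print(node_flow_feasible('X', arcs, {'S': 10}, {'X': 10}))   # False
```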
Missing S Supply and Missing D Demand Values
In some models, you may want a node to be either a supply or demand node but you want the node
to supply or demand the optimal number of flow units. To indicate that a node is such a supply node,
use a missing S value in the SUPPLY list variable in the ARCDATA= data set or the SUPDEM list
variable in the NODEDATA= data set. To indicate that a node is such a demand node, use a missing
D value in the DEMAND list variable in the ARCDATA= data set or the SUPDEM list variable in
the NODEDATA= data set.
Suppose the oil example in the section Introductory NPSC Example on page 64 is changed so that
crude oil can be obtained from either the Middle East or U.S.A. in any amounts. You should specify
that the node middle east is a supply node, but you do not want to stipulate that it supplies 100 units,
as before. The node u.s.a. should also remain a supply node, but you do not want to stipulate that it
supplies 80 units. You must specify that these nodes have missing S supply capabilities:
title 'Oil Industry Example';
title3 'Crude Oil can come from anywhere';
data miss_s;
missing S;
input _node_&$15. _sd_;
datalines;
middle east S
u.s.a. S
servstn1 gas -95
servstn1 diesel -30
servstn2 gas -40
servstn2 diesel -15
;
The following PROC INTPOINT run uses the same ARCDATA= and CONDATA= data sets used in
the section "Introductory NPSC Example" on page 64:
proc intpoint
   bytes=100000
   nodedata=miss_s   /* the supply (missing S) and demand data */
   arcdata=arcd1     /* the arc descriptions                   */
   condata=cond1     /* the side constraints                   */
   conout=solution;  /* the solution data set                  */
run;
proc print;
var _from_ _to_ _cost_ _capac_ _lo_ _flow_ _fcost_;
sum _fcost_;
run;
The following messages appear on the SAS log:
NOTE: Number of nodes= 14 .
NOTE: All supply nodes have unspecified (.S) supply capability. Number of these
nodes= 2 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 0 , total demand= 180 .
NOTE: Number of arcs= 18 .
NOTE: Number of <= side constraints= 0 .
NOTE: Number of == side constraints= 2 .
NOTE: Number of >= side constraints= 2 .
NOTE: Number of side constraint coefficients= 8 .
NOTE: The following messages relate to the equivalent Linear Programming
problem solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 17 .
NOTE: Number of >= constraints= 2 .
NOTE: Number of constraint coefficients= 48 .
NOTE: Number of variables= 20 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 5.
NOTE: After preprocessing, number of >= constraints= 2.
NOTE: The preprocessor eliminated 12 constraints from the problem.
NOTE: The preprocessor eliminated 33 constraint coefficients from the problem.
NOTE: After preprocessing, number of variables= 6.
NOTE: The preprocessor eliminated 14 variables from the problem.
NOTE: 6 columns, 0 rows and 6 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 19 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 7 factor nodes make up 2 supernodes
NOTE: There are 4 nonzero sub-rows or sub-columns outside the supernodal
triangular regions along the factor's leading diagonal.
NOTE: Bound feasibility attained by iteration 1.
NOTE: Dual feasibility attained by iteration 1.
NOTE: Constraint feasibility attained by iteration 1.
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 6
iterations.
NOTE: Optimum reached.
NOTE: Objective= 50075.
NOTE: The data set WORK.SOLUTION has 18 observations and 10 variables.
NOTE: There were 18 observations read from the data set WORK.ARCD1.
NOTE: There were 6 observations read from the data set WORK.MISS_S.
NOTE: There were 4 observations read from the data set WORK.COND1.
The CONOUT= data set is shown in Figure 4.11.
Figure 4.11 Missing S SUPDEM Values in NODEDATA
Oil Industry Example
Crude Oil can come from anywhere
Obs _from_ _to_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_
1 refinery 1 r1 200 175 50 145.000 29000.00
2 refinery 2 r2 220 100 35 35.000 7700.00
3 r1 ref1 diesel 0 75 0 36.250 0.00
4 r1 ref1 gas 0 140 0 108.750 0.00
5 r2 ref2 diesel 0 75 0 8.750 0.00
6 r2 ref2 gas 0 100 0 26.250 0.00
7 middle east refinery 1 63 95 20 20.000 1260.00
8 u.s.a. refinery 1 55 99999999 0 125.000 6875.00
9 middle east refinery 2 81 80 10 10.000 810.00
10 u.s.a. refinery 2 49 99999999 0 25.000 1225.00
11 ref1 diesel servstn1 diesel 18 99999999 0 30.000 540.00
12 ref2 diesel servstn1 diesel 36 99999999 0 0.000 0.00
13 ref1 gas servstn1 gas 15 70 0 68.750 1031.25
14 ref2 gas servstn1 gas 17 35 5 26.250 446.25
15 ref1 diesel servstn2 diesel 17 99999999 0 6.250 106.25
16 ref2 diesel servstn2 diesel 23 99999999 0 8.750 201.25
17 ref1 gas servstn2 gas 22 60 0 40.000 880.00
18 ref2 gas servstn2 gas 31 99999999 0 0.000 0.00
========
50075.00
The optimal supplies of nodes middle east and u.s.a. are 30 and 150 units, respectively. For this
example, the same optimal solution is obtained if these nodes had supplies less than these values
(each supplies 1 unit, for example) and the THRUNET option were specified in the PROC INTPOINT
statement. With the THRUNET option active, when total supply exceeds total demand, the specified
nonmissing demand values are the lowest number of flow units that must be absorbed by the
corresponding node. This is demonstrated in the following PROC INTPOINT run. The missing S is
most useful when nodes are to supply optimal numbers of flow units and it turns out that for some
nodes, the optimal supply is 0.
data miss_s_x;
missing S;
input _node_&$15. _sd_;
datalines;
middle east 1
u.s.a. 1
servstn1 gas -95
servstn1 diesel -30
servstn2 gas -40
servstn2 diesel -15
;
proc intpoint
   bytes=100000
   thrunet
   nodedata=miss_s_x /* No supply (missing S)   */
   arcdata=arcd1     /* the arc descriptions    */
   condata=cond1     /* the side constraints    */
   conout=solution;  /* the solution data set   */
run;
proc print;
var _from_ _to_ _cost_ _capac_ _lo_ _flow_ _fcost_;
sum _fcost_;
run;
The following messages appear on the SAS log. Note that Total supply= 2, not 0 as in the previous
run:
NOTE: Number of nodes= 14 .
NOTE: Number of supply nodes= 2 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 2 , total demand= 180 .
NOTE: Number of arcs= 18 .
NOTE: Number of <= side constraints= 0 .
NOTE: Number of == side constraints= 2 .
NOTE: Number of >= side constraints= 2 .
NOTE: Number of side constraint coefficients= 8 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 17 .
NOTE: Number of >= constraints= 2 .
NOTE: Number of constraint coefficients= 48 .
NOTE: Number of variables= 20 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 5.
NOTE: After preprocessing, number of >= constraints= 2.
NOTE: The preprocessor eliminated 12 constraints from the problem.
NOTE: The preprocessor eliminated 33 constraint coefficients from the problem.
NOTE: After preprocessing, number of variables= 6.
NOTE: The preprocessor eliminated 14 variables from the problem.
NOTE: 6 columns, 0 rows and 6 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 19 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 7 factor nodes make up 2 supernodes
NOTE: There are 4 nonzero sub-rows or sub-columns outside the supernodal triangular
regions along the factor's leading diagonal.
NOTE: Bound feasibility attained by iteration 1.
NOTE: Dual feasibility attained by iteration 1.
NOTE: Constraint feasibility attained by iteration 1.
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 6
iterations.
NOTE: Optimum reached.
NOTE: Objective= 50075.
NOTE: The data set WORK.SOLUTION has 18 observations and 10 variables.
NOTE: There were 18 observations read from the data set WORK.ARCD1.
NOTE: There were 6 observations read from the data set WORK.MISS_S_X.
NOTE: There were 4 observations read from the data set WORK.COND1.
If total supply exceeds total demand, any missing S values are ignored. If total demand exceeds total
supply, any missing D values are ignored.
Balancing Total Supply and Total Demand
When Total Supply Exceeds Total Demand
When the total supply of a network problem exceeds the total demand, PROC INTPOINT adds an extra
node (called the excess node) to the problem and sets the demand at that node equal to the difference
between total supply and total demand. There are three ways that this excess node can be joined to
the network. All three ways entail PROC INTPOINT generating a set of arcs (henceforth referred to
as the generated arcs) that are directed toward the excess node. The total amount of flow in generated
arcs equals the demand of the excess node. The generated arcs originate from one of three sets of
nodes.
When you specify the THRUNET option, the set of nodes that generated arcs originate from is all
demand nodes, even those demand nodes with unspecified demand capability. You indicate that a node
has unspecified demand capability by using a missing D value instead of an actual value for demand
data (discussed in the section "Missing S Supply and Missing D Demand Values" on page 123).
The value specified as the demand of a demand node is in effect a lower bound on the number
of flow units that node can actually demand. For missing D demand nodes, this lower bound is zero.
If you do not specify the THRUNET option, the way in which the excess node is joined to the
network depends on whether there are demand nodes with unspecified demand capability (nodes
with missing D demand) or not.
If there are missing D demand nodes, these nodes are the set of nodes that generated arcs originate
from. The value specified as the demand of a demand node, if not missing D, is the number of flow
units that node can actually demand. For a missing D demand node, the actual demand of that node
may be zero or greater.
If there are no missing D demand nodes, the set of nodes that generated arcs originate from is the
set of supply nodes. The value specified as the supply of a supply node is in effect an upper bound on
the number of flow units that node can actually supply. For missing S supply nodes (discussed in the
section "Missing S Supply and Missing D Demand Values" on page 123), this upper bound is zero,
so missing S nodes when total supply exceeds total demand are transshipment nodes, that is, nodes
that neither supply nor demand flow.
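As a sketch, a NODEDATA= data set that declares missing D demand nodes for this situation might look like the following. The node names reuse those of this chapter's oil example for illustration; with total supply exceeding total demand and THRUNET not specified, the generated arcs would originate from the missing D nodes:

```sas
/* Sketch: declare two demand nodes whose actual demand is left for
   the optimization to determine, using missing D values.           */
data miss_d;
   missing D;                 /* treat D in the input as a special missing value */
   input _node_&$15. _sd_;
   datalines;
servstn1 gas  D
servstn1 diesel  D
servstn2 gas  -40
servstn2 diesel  -15
;
```

The & modifier on the $15. informat allows single embedded blanks in node names, so two or more blanks separate the node name from the supply/demand value.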
When Total Supply Is Less Than Total Demand
When the total supply of a network problem is less than the total demand, PROC INTPOINT adds an extra
node (called the excess node) to the problem and sets the supply at that node equal to the difference
between total demand and total supply. There are three ways that this excess node can be joined to
the network. All three ways entail PROC INTPOINT generating a set of arcs (henceforth referred to
as the generated arcs) that originate from the excess node. The total amount of flow in generated
arcs equals the supply of the excess node. The generated arcs are directed toward one of three sets of
nodes.
When you specify the THRUNET option, the set of nodes that generated arcs are directed toward
is all supply nodes, even those supply nodes with unspecified supply capability. You indicate
that a node has unspecified supply capability by using a missing S value instead of an actual value
for supply data (discussed in the section "Missing S Supply and Missing D Demand Values" on
page 123). The value specified as the supply of a supply node is in effect a lower bound on the
number of flow units that the node can actually supply. For missing S supply nodes, this lower bound
is zero.
If you do not specify the THRUNET option, the way in which the excess node is joined to the
network depends on whether there are supply nodes with unspecified supply capability (nodes with
missing S supply) or not.
If there are missing S supply nodes, these nodes are the set of nodes that generated arcs are directed
toward. The value specified as the supply of a supply node, if not missing S, is the number of flow
units that the node can actually supply. For a missing S supply node, the actual supply of that node
may be zero or greater.
If there are no missing S supply nodes, the set of nodes that generated arcs are directed toward is
the set of demand nodes. The value specified as the demand of a demand node is in effect an upper
bound on the number of flow units that node can actually demand. For missing D demand nodes
(discussed in the section "Missing S Supply and Missing D Demand Values" on page 123), this upper
bound is zero, so missing D nodes when total supply is less than total demand are transshipment
nodes, that is, nodes that neither supply nor demand flow.
How to Make the Data Read of PROC INTPOINT More Efficient
This section contains information that is useful when you want to solve large constrained network
problems. However, much of this information is also useful if you have a large linear programming
problem. All of the options described in this section that are not directly applicable to networks
(options such as ARCS_ONLY_ARCDATA, ARC_SINGLE_OBS, NNODES=, and NARCS=) can
be specified to improve the speed at which LP data is read.
Large Constrained Network Problems
Many of the models presented to PROC INTPOINT are enormous. They can be considered large by
linear programming standards; problems with thousands, even millions, of variables and constraints.
When dealing with side constrained network programming problems, models can have not only a
linear programming component of that magnitude, but also a larger, possibly much larger, network
component.
The majority of a network problem's decision variables are arcs. Like an LP decision variable, an arc
has an objective function coefficient, upper and lower value bounds, and a name. Arcs can have
coefficients in constraints. Therefore, an arc is quite similar to an LP variable and places the same
memory demands on optimization software as an LP variable. But a typical network model has
many more arcs and nonarc variables than the typical LP model has variables. And arcs have tail and
head nodes. Storing and processing node names requires huge amounts of memory. To make matters
worse, node names occupy memory at times when a large amount of other data should also reside in
memory.
While memory requirements are lower for a model with an embedded network component compared
with the equivalent LP once optimization starts, the same is usually not true during the data read.
Even though nodal flow conservation constraints in the LP should not be specified in the constrained
network formulation, the memory requirements to read the latter are greater because each arc (unlike
an LP variable) originates at one node and is directed toward another.
Paging
PROC INTPOINT has facilities to read data when the available memory is insufficient to store all the
data at once. PROC INTPOINT does this by allocating memory for different purposes; for example,
to store an array or receive data read from an input SAS data set. After that memory has filled,
the information is written to disk and PROC INTPOINT can resume filling that memory with new
information. Often, information must be retrieved from disk so that data previously read can be
examined or checked for consistency. Sometimes, to prevent any data from being lost, or to retain
any changes made to the information in memory, the contents of the memory must be sent to disk
before other information can take its place. This process of swapping information to and from disk is
called paging. Paging can be very time-consuming, so it is crucial to minimize the amount of paging
performed.
There are several steps you can take to make PROC INTPOINT read the data of network and linear
programming models more efficiently, particularly when memory is scarce and the amount of paging
must be reduced. PROC INTPOINT will then be able to tackle large problems in what can be
considered reasonable amounts of time.
The Order of Observations
PROC INTPOINT is quite flexible in the ways data can be supplied to it. Data can be given by
any reasonable means. PROC INTPOINT has convenient defaults that can save you work when
generating the data. There can be several ways to supply the same piece of data, and some pieces of
data can be given more than once. PROC INTPOINT reads everything, then merges it all together.
However, this flexibility and convenience come at a price; PROC INTPOINT may not assume the
data has a characteristic that, if possessed by the data, could save time and memory during the data
read. Several options can indicate that the data has some exploitable characteristic.
For example, an arc cost can be specified once or several times in the ARCDATA= data set or the
CONDATA= data set, or both. Every time it is given in the ARCDATA= data set, a check is made
to ensure that the new value is the same as any corresponding value read in a previous observation
of the ARCDATA= data set. Every time it is given in the CONDATA= data set, a check is made to
ensure that the new value is the same as the value read in a previous observation of the CONDATA=
data set, or previously in the ARCDATA= data set. PROC INTPOINT would save time if it knew that
arc cost data would be encountered only once while reading the ARCDATA= data set, so performing
the time-consuming check for consistency would not be necessary. Also, if you indicate that the
CONDATA= data set contains data for constraints only, PROC INTPOINT will not expect any arc
information, so memory will not be allocated to receive such data while reading the CONDATA=
data set. This memory is used for other purposes, and this might lead to a reduction in paging.
If applicable, use the ARC_SINGLE_OBS or the CON_SINGLE_OBS option, or both, and the
NON_REPLIC=COEFS specification to improve how the ARCDATA= data set and the CONDATA=
data set are read.
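A call that asserts these characteristics might be sketched as follows; the data set names arcs, cons, and solution are placeholders, not names from the examples above:

```sas
/* Sketch: tell PROC INTPOINT that each arc's data occupies a single
   ARCDATA= observation, each constraint's data a single CONDATA=
   observation, and that no constraint coefficient is duplicated.    */
proc intpoint
   bytes=1000000
   arc_single_obs     /* each arc appears in one ARCDATA= observation   */
   con_single_obs     /* each constraint in one CONDATA= observation    */
   non_replic=coefs   /* constraint coefficients are given only once    */
   arcdata=arcs condata=cons conout=solution;
run;
```

If the data do not actually have the asserted characteristic, results are unpredictable, so assert these options only when you are certain they hold.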
PROC INTPOINT allows the observations in input data sets to be in any order. However, major
time savings can result if you are prepared to order observations in particular ways. Time spent by
the SORT procedure to sort the input data sets, particularly the CONDATA= data set, may be more
than made up for when PROC INTPOINT reads them, because PROC INTPOINT has in memory
information possibly used when the previous observation was read. PROC INTPOINT can assume a
piece of data is either similar to that of the last observation read or is new. In the first case, valuable
information such as an arc or a nonarc variable number or a constraint number is retained from the
previous observation. In the second case, checking the data against what has been read previously is not
necessary.
Even if you do not sort the CONDATA= data set, grouping observations that contain data for the
same arc or nonarc variable or the same row pays off. PROC INTPOINT establishes whether an
observation being read is similar to the observation just read.
In practice, many input data sets for PROC INTPOINT have this characteristic, because it is natural
for data for each constraint to be grouped together (when using the dense format of the CONDATA=
data set) or data for each column to be grouped together (when using the sparse format of the
CONDATA= data set). If data for each arc or nonarc variable is spread over more than one observation of the
ARCDATA= data set, it is natural to group these observations together.
Use the GROUPED= option to indicate whether observations of the ARCDATA= data set, the
CONDATA= data set, or both, are grouped in a way that can be exploited during the data read.
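The grouping advice might be sketched as follows; the data set name cons and the COLUMN list variable name _col_ are placeholders:

```sas
/* Sketch: sort a sparse-format CONDATA= data set so observations for
   the same column are adjacent, then declare the grouping.           */
proc sort data=cons;
   by _col_;             /* group observations by column name */
run;

proc intpoint
   bytes=1000000
   grouped=condata       /* CONDATA= observations are grouped */
   sparsecondata         /* CONDATA= is in sparse format      */
   arcdata=arcs condata=cons conout=solution;
run;
```

GROUPED=ARCDATA and GROUPED=BOTH are the other possible values, for when the ARCDATA= data set (or both data sets) are grouped.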
You can save time if the type data for each row appears near the top of the CONDATA= data set,
especially if it has the sparse format. Otherwise, when reading an observation, if PROC INTPOINT
does not know whether a row is a constraint or a special row, the data is set aside. Once the data set has been
completely read, PROC INTPOINT must reprocess the data it set aside. By then, it knows the type
of each constraint or row or, if its type was not provided, it is assumed to have a default type.
Better Memory Utilization
In order for PROC INTPOINT to make better utilization of available memory, you can specify
options that indicate the approximate size of the model. PROC INTPOINT then knows what to
expect. For example, if you indicate that the problem has no nonarc variables, PROC INTPOINT
will not allocate memory to store nonarc data. That memory is better utilized for other purposes.
Memory is often allocated to receive or store data of some type. If you indicate that the model does
not have much data of a particular type, the memory that would otherwise have been allocated to
receive or store that data can be used to receive or store data of another type.
The problem size options are as follows:
- NNODES= approximate number of nodes
- NARCS= approximate number of arcs
- NNAS= approximate number of nonarc variables or LP variables
- NCONS= approximate number of NPSC side constraints or LP constraints
- NCOEFS= approximate number of NPSC side constraint coefficients or LP constraint coefficients
These options will sometimes be referred to as Nxxxx= options.
You do not need to specify all these options for the model, but the more you do, the better. If you
do not specify some or all of these options, PROC INTPOINT guesses the size of the problem by
using what it already knows about the model. Sometimes PROC INTPOINT guesses the size of
the model by looking at the number of observations in the ARCDATA= and the CONDATA= data
sets. However, PROC INTPOINT uses rough rules of thumb, namely that typical models are proportioned
in certain ways (for example, if there are constraints, then arcs, nonarc variables, or LP variables
usually have about five constraint coefficients). If your model has an unusual shape or structure, you
are encouraged to use these options.
If you do use the options and you do not know the exact values to specify, overestimate the values.
For example, if you specify NARCS=10000 but the model has 10100 arcs, then when dealing with the
last 100 arcs, PROC INTPOINT might have to page out data for 10000 arcs each time one of the
last arcs must be dealt with. Memory could have been allocated for all 10100 arcs without affecting
(much) the rest of the data read, so NARCS=10000 could be more of a hindrance than a help.
The point of these Nxxxx= options is to indicate the model size when PROC INTPOINT does not
know it. When PROC INTPOINT knows the real value, that value is used instead of the Nxxxx= value.
ARCS_ONLY_ARCDATA indicates that data for only arcs are in the ARCDATA= data set. Memory
would not be wasted to receive data for nonarc variables.
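Putting this advice together, a call for a hypothetical large model might be sketched as follows; the counts shown are illustrative overestimates, not values from the text:

```sas
/* Sketch: deliberately overestimated problem-size hints so no
   option becomes a paging hindrance.                           */
proc intpoint
   bytes=20000000
   nnodes=12000 narcs=60000   /* overestimates of nodes and arcs        */
   nnas=500 ncons=3000        /* nonarc variables and side constraints  */
   ncoefs=15000               /* side constraint coefficients           */
   arcs_only_arcdata          /* ARCDATA= contains data for arcs only   */
   arcdata=arcs condata=cons conout=solution;
run;
```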
Use the memory usage options:
- The BYTES= option specifies the size of PROC INTPOINT's main working memory in number of bytes.
- The MEMREP option indicates that a memory usage report is to be displayed on the SAS log.
Specifying an appropriate value for the BYTES= parameter is particularly important. Specify as
large a number as possible, but not so large a number that will cause PROC INTPOINT (that is, the
SAS System running underneath PROC INTPOINT) to run out of memory.
PROC INTPOINT reports its memory requirements on the SAS log if you specify the MEMREP
option.
Use Defaults to Reduce the Amount of Data
Use the parameters that specify default values as much as possible. For example, if there are many
arcs with the same cost value c, use DEFCOST=c for arcs that have that cost. Use missing values
in the COST variable in the ARCDATA= data set instead of c. PROC INTPOINT ignores missing
values, but must read, store, and process nonmissing values, even if they are equal to a default option
or could have been equal to a default parameter had it been specified. Sometimes, using default
parameters makes some SAS variables in the ARCDATA= and the CONDATA= data
sets no longer necessary, or reduces the quantity of data that must be read. The default options are
- DEFCOST= default cost of arcs, objective function coefficient of nonarc variables or LP variables
- DEFMINFLOW= default lower flow bound of arcs, lower bound of nonarc variables or LP variables
- DEFCAPACITY= default capacity of arcs, upper bound of nonarc variables or LP variables
- DEFCONTYPE=LE or DEFCONTYPE=<=
  DEFCONTYPE=EQ or DEFCONTYPE==
  DEFCONTYPE=GE or DEFCONTYPE=>=
  DEFCONTYPE=LE is the default.
The default options themselves have defaults. For example, you do not need to specify DEFCOST=0
in the PROC INTPOINT statement. You should still have missing values in the COST variable in the
ARCDATA= data set for arcs that have zero costs.
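The DEFCOST= advice might be sketched as follows; the arc data, node names, and the cost value 10 are illustrative, and the PROC INTPOINT statement is a fragment showing only the relevant options:

```sas
/* Sketch: most arcs share cost 10, so give that cost once as a
   default and leave the COST variable missing for those arcs.  */
data arcs;
   input _tail_ $ _head_ $ _cost_;
   datalines;
a b .
a c .
b c 15
;

proc intpoint
   defcost=10          /* cost used for arcs whose _cost_ is missing */
   arcdata=arcs conout=solution;
run;
```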
If the network has only one supply node, one demand node, or both, use
- SOURCE= name of the single node that has supply capability
- SUPPLY= the amount of supply at SOURCE
- SINK= name of the single node that demands flow
- DEMAND= the amount of flow SINK demands
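For instance, a single-source, single-sink network might be sketched as follows; the node names factory and market and the value 100 are illustrative:

```sas
/* Sketch: specify the only supply and demand nodes with options,
   so no supply/demand data set or variables are needed.          */
proc intpoint
   source=factory supply=100   /* the single supply node and its supply */
   sink=market   demand=100    /* the single demand node and its demand */
   arcdata=arcs conout=solution;
run;
```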
Do not specify that a constraint has zero right-hand-side values. That is the default. The only time it
might be practical to specify a zero rhs is in observations of the CONDATA= data set read early, so
that PROC INTPOINT can infer that a row is a constraint. This could prevent coefficient data from
being put aside because PROC INTPOINT did not know the row was a constraint.
Names of Things
To cut data read time and memory requirements, reduce the number of bytes in the longest node
name, the longest arc name, the longest nonarc variable name, the longest LP variable name, and the
longest constraint name to 8 bytes or less. The longer a name, the more bytes must be stored and
compared with other names.
If an arc has no constraint coefficients, do not give it a name in the NAME list variable in the
ARCDATA= data set. Names for such arcs serve no purpose.
PROC INTPOINT can have a default name for each arc. If an arc is directed from node tailname
toward node headname, the default name for that arc is tailname_headname. If you do not want
PROC INTPOINT to use these default arc names, specify NAMECTRL=1. Otherwise, PROC
INTPOINT must use memory for storing node names and these node names must be searched often.
If you want to use the default tailname_headname name, that is, NAMECTRL=2 or NAMECTRL=3,
do not use underscores in node names. If the CONDATA= data set has the dense format and has a variable in the
VAR list named A_B_C_D, or if the value A_B_C_D is encountered as a value of the COLUMN list variable
when reading the CONDATA= data set that has the sparse format, PROC INTPOINT first looks for
a node named A. If it finds it, it looks for a node called B_C_D. It then looks for a node with the
name A_B and possibly a node with name C_D. A search is then conducted for a node named A_B_C
and possibly a node named D. Underscores could cause PROC INTPOINT to look
unnecessarily for nonexistent nodes. Searching for node names can be expensive, and the amount
of memory to store node names is often large. It might be better to assign the arc name A_B_C_D
directly to an arc by having that value as a NAME list variable value for that arc in the ARCDATA=
data set and specify NAMECTRL=1.
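That last suggestion might be sketched as follows; the node names A and B_C_D, the cost 5, and the data set names are illustrative:

```sas
/* Sketch: name the arc explicitly in ARCDATA= and suppress the
   default tailname_headname arc names with NAMECTRL=1.         */
data arcs;
   input _tail_ $ _head_ $ _cost_ _name_ $;
   datalines;
A B_C_D 5 A_B_C_D
;

proc intpoint
   namectrl=1            /* do not generate default arc names */
   arcdata=arcs condata=cons conout=solution;
run;
```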
Other Ways to Speed Up Data Reads
Arcs and nonarc variables, or LP variables, can have associated with them values or quantities that
have no bearing on the optimization. This information is given in the ARCDATA= data set in the
ID list variables. For example, in a distribution problem, information such as truck number and
driver's name can be associated with each arc. This is useful when the optimal solution saved in the
CONOUT= data set is analyzed. However, PROC INTPOINT needs to reserve memory to process
this information when data is being read. For large problems when memory is scarce, it might be
better to remove ancillary data from the ARCDATA= data set. After PROC INTPOINT runs, use SAS software
to merge this information into the CONOUT= data set that contains the optimal solution.
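Such a post-run merge might be sketched as follows; the data set name arcinfo (holding the ancillary truck and driver variables, keyed by arc endpoints) is a hypothetical name introduced here:

```sas
/* Sketch: merge ancillary arc information, kept out of the
   optimization, back into the solution by arc endpoints.    */
proc sort data=solution;  by _from_ _to_; run;
proc sort data=arcinfo;   by _from_ _to_; run;

data solution_full;
   merge solution arcinfo;   /* adds the truck and driver variables */
   by _from_ _to_;
run;
```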
Stopping Criteria
There are several reasons why PROC INTPOINT stops interior point optimization. Optimization
stops when
- the number of iterations equals MAXITERB=m
- the relative gap (duality gap divided by c^T x) between the primal and dual objectives is smaller than
the value of the PDGAPTOL= option, and both the primal and dual problems are feasible.
The duality gap is defined in the section "Interior Point Algorithmic Details" on page 48.
PROC INTPOINT may stop optimization when it detects that the rate at which the complementarity
or duality gap is being reduced is too slow; that is, there are consecutive iterations when the
complementarity or duality gap has stopped getting smaller and the infeasibilities, if nonzero, have
also stalled. Sometimes this indicates that the problem is infeasible.
The reasons to stop optimization outlined in the previous paragraph will be termed the usual stopping
conditions in the following explanation.
However, when solving some problems, especially if the problems are large, the usual stopping
criteria are inappropriate. PROC INTPOINT might stop optimizing prematurely. If it were allowed
to perform additional optimization, a better solution would be found. On other occasions, PROC
INTPOINT might do too much work. A sufficiently good solution might be reached several iterations
before PROC INTPOINT eventually stops.
You can see PROC INTPOINT's progress to the optimum by specifying PRINTLEVEL2=2. PROC
INTPOINT will produce a table on the SAS log. A row of the table is generated during each iteration
and consists of values of the affine step complementarity, the complementarity of the solution for
the next iteration, the total bound infeasibility sum_{i=1}^n infeas_bi (see the infeas_b array in the section
"Interior Point: Upper Bounds" on page 52), the total constraint infeasibility sum_{i=1}^n infeas_ci (see
the infeas_c array in the section "Interior Point Algorithmic Details" on page 48), and the total dual
infeasibility sum_{i=1}^n infeas_di (see the infeas_d array in the section "Interior Point Algorithmic Details"
on page 48). As optimization progresses, the values in all columns should converge to zero.
To tailor stopping criteria to your problem, you can use two sets of parameters: the STOP_x
and the KEEPGOING_x parameters. The STOP_x parameters (STOP_C, STOP_DG, STOP_IB,
STOP_IC, and STOP_ID) are used to test for some condition at the beginning of each iteration
and, if the condition is met, to stop optimizing immediately. The KEEPGOING_x parameters (KEEPGOING_C,
KEEPGOING_DG, KEEPGOING_IB, KEEPGOING_IC, and KEEPGOING_ID) are used when
PROC INTPOINT would ordinarily stop optimizing but does not if some conditions are not met.
For the sake of conciseness, a set of options might be referred to as the part of the option name they
have in common followed by the suffix x. For example, STOP_C, STOP_DG, STOP_IB, STOP_IC,
and STOP_ID will collectively be referred to as STOP_x.
At the beginning of each iteration, PROC INTPOINT will test whether complementarity is <=
STOP_C (provided you have specified a STOP_C parameter) and if it is, PROC INTPOINT will
stop optimizing. If the duality gap is <= STOP_DG (provided you have specified a STOP_DG
parameter), PROC INTPOINT will stop optimizing immediately. This is also true for the other
STOP_x parameters that are related to infeasibilities: STOP_IB, STOP_IC, and STOP_ID.
For example, if you want PROC INTPOINT to stop optimizing for the usual stopping conditions,
plus the additional condition, complementarity <= 100 or duality gap <= 0.001, then use
proc intpoint stop_c=100 stop_dg=0.001
If you want PROC INTPOINT to stop optimizing for the usual stopping conditions, plus the additional
condition, complementarity <= 1000 and duality gap <= 0.01 and constraint infeasibility <= 0.0001,
then use
proc intpoint
   and_stop_c=1000 and_stop_dg=0.01 and_stop_ic=0.0001
Unlike the STOP_x parameters that cause PROC INTPOINT to stop optimizing when any one of
them is satisfied, the corresponding AND_STOP_x parameters (AND_STOP_C, AND_STOP_DG,
AND_STOP_IB, AND_STOP_IC, and AND_STOP_ID) cause PROC INTPOINT to stop only if
all (more precisely, all that are specified) options are satisfied. For example, if PROC INTPOINT
should stop optimizing when
- complementarity <= 100 or duality gap <= 0.001, or
- complementarity <= 1000 and duality gap <= 0.01 and constraint infeasibility <= 0.0001
then use
proc intpoint
   stop_c=100 stop_dg=0.001
   and_stop_c=1000 and_stop_dg=0.01 and_stop_ic=0.0001
Just as the STOP_x parameters have AND_STOP_x partners, the KEEPGOING_x parameters
have AND_KEEPGOING_x partners. The role of the KEEPGOING_x and AND_KEEPGOING_x
parameters is to prevent optimization from stopping too early, even though a usual stopping criterion
is met.
When PROC INTPOINT detects that it should stop optimizing for a usual stopping condition, it will
perform the following tests:

• It will test whether complementarity is > KEEPGOING_C (provided you have specified a
  KEEPGOING_C parameter), and if it is, PROC INTPOINT will perform more optimization.
• Otherwise, PROC INTPOINT will then test whether the primal-dual gap is > KEEPGOING_DG
  (provided you have specified a KEEPGOING_DG parameter), and if it is, PROC INTPOINT
  will perform more optimization.
• Otherwise, PROC INTPOINT will then test whether the total bound infeasibility,
  sum_{i=1}^{n} infeas_bi, is > KEEPGOING_IB (provided you have specified a KEEPGOING_IB
  parameter), and if it is, PROC INTPOINT will perform more optimization.
• Otherwise, PROC INTPOINT will then test whether the total constraint infeasibility,
  sum_{i=1}^{n} infeas_ci, is > KEEPGOING_IC (provided you have specified a KEEPGOING_IC
  parameter), and if it is, PROC INTPOINT will perform more optimization.
• Otherwise, PROC INTPOINT will then test whether the total dual infeasibility,
  sum_{i=1}^{n} infeas_di, is > KEEPGOING_ID (provided you have specified a KEEPGOING_ID
  parameter), and if it is, PROC INTPOINT will perform more optimization.
• Otherwise it will test whether complementarity is > AND_KEEPGOING_C (provided
  you have specified an AND_KEEPGOING_C parameter), and the primal-dual gap is
  > AND_KEEPGOING_DG (provided you have specified an AND_KEEPGOING_DG
  parameter), and the total bound infeasibility sum_{i=1}^{n} infeas_bi > AND_KEEPGOING_IB
  (provided you have specified an AND_KEEPGOING_IB parameter), and the total constraint
  infeasibility sum_{i=1}^{n} infeas_ci > AND_KEEPGOING_IC (provided you have specified
  an AND_KEEPGOING_IC parameter), and the total dual infeasibility sum_{i=1}^{n} infeas_di >
  AND_KEEPGOING_ID (provided you have specified an AND_KEEPGOING_ID parameter),
  and if all specified conditions hold, PROC INTPOINT will perform more optimization.
If all these tests to decide whether more optimization should be performed are false, optimization is
stopped.
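The chain of tests above amounts to the following predicate: optimization continues if any single specified KEEPGOING_x test fires or, failing that, if every specified AND_KEEPGOING_x test fires together. This minimal Python sketch of that logic is for illustration only; the parameter names kg_c, and_group, and so on are invented here, and PROC INTPOINT itself is configured through its options, not through code like this:

```python
def keep_going(c, dg, ib, ic, idl,
               kg_c=None, kg_dg=None, kg_ib=None, kg_ic=None, kg_id=None,
               and_group=()):
    # c, dg, ib, ic, idl: complementarity, primal-dual gap, and total bound,
    # constraint, and dual infeasibilities at the current iteration.
    # Any single specified KEEPGOING_x test that fires forces more optimization.
    singles = ((c, kg_c), (dg, kg_dg), (ib, kg_ib), (ic, kg_ic), (idl, kg_id))
    if any(limit is not None and value > limit for value, limit in singles):
        return True
    # Otherwise, every specified AND_KEEPGOING_x test must fire together.
    return bool(and_group) and all(value > limit for value, limit in and_group)

print(keep_going(2000, 0, 0, 0, 0, kg_c=1500))            # True: c > 1500
print(keep_going(1200, 0, 0, 0, 0, kg_c=1500))            # False: stop stands
print(keep_going(1200, 0.1, 0, 0, 0, kg_c=1500,
                 and_group=[(1200, 1000), (0.1, 0.05)]))  # True: AND group fires
```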
The following PROC INTPOINT example is used to illustrate how several stopping criteria options
can be used together:
proc intpoint
stop_c=1000
and_stop_c=2000 and_stop_dg=0.01
and_stop_ib=1 and_stop_ic=1 and_stop_id=1
keepgoing_c=1500
and_keepgoing_c=2500 and_keepgoing_dg=0.05
and_keepgoing_ib=1 and_keepgoing_ic=1 and_keepgoing_id=1
At the beginning of each iteration, PROC INTPOINT will stop optimizing if

• complementarity <= 1000, or
• complementarity <= 2000 and duality gap <= 0.01 and the total bound, constraint, and dual
  infeasibilities are each <= 1

When PROC INTPOINT determines it should stop optimizing because a usual stopping condition is
met, it will stop optimizing only if

• complementarity <= 1500, or
• complementarity <= 2500 and duality gap <= 0.05 and the total bound, constraint, and dual
  infeasibilities are each <= 1
Examples: INTPOINT Procedure
The following examples illustrate some of the capabilities of PROC INTPOINT. These examples,
together with the other SAS/OR examples, can be found in the SAS sample library.
In order to illustrate variations in the use of the INTPOINT procedure, Example 4.1 through
Example 4.5 use data from a company that produces two sizes of televisions. The company makes
televisions with a diagonal screen measurement of either 19 inches or 25 inches. These televisions
are made between March and May at both of the company's two factories. Each factory has a limit
on the total number of televisions of each screen dimension that can be made during those months.
The televisions are distributed to one of two shops, stored at the factory where they were made
and sold later, or shipped to the other factory. Some sets can be used to fill backorders from the
previous months. Each shop demands a number of each type of TV for the months of March through
May. The network in Figure 4.12 illustrates the model. Arc costs can be interpreted as
production costs, storage costs, backorder penalty costs, inter-factory transportation costs, and sales
profits. The arcs can have capacities and lower flow bounds.
Figure 4.12 TV Problem

[Network diagram: the factory supply nodes fact1 and fact2 connect by Production arcs to the
monthly nodes f1_mar, f1_apl, f1_may and f2_mar, f2_apl, f2_may; Inventory and Backorder arcs
link consecutive months at each factory; Inter-factory arcs join corresponding monthly nodes; and
Distribution arcs run from the monthly nodes to the demand nodes shop1 and shop2.]
There are two similarly structured networks, one for the 19-inch televisions and the other for the
25-inch screen TVs. The minimum cost production, inventory, and distribution plan for both TV
types can be determined in the same run of PROC INTPOINT. To ensure that node names are
unambiguous, the names of nodes in the 19-inch network have suffix _1, and the node names in the
25-inch network have suffix _2.
Example 4.1: Production, Inventory, Distribution Problem
The following code shows how to save a specific problem's data in data sets and solve the model
with PROC INTPOINT.
title 'Production Planning/Inventory/Distribution';
title2 'Minimum Cost Flow problem';
title3;
data node0;
input _node_ $ _supdem_ ;
datalines;
fact1_1 1000
fact2_1 850
fact1_2 1000
fact2_2 1500
shop1_1 -900
shop2_1 -900
shop1_2 -900
shop2_2 -1450
;
data arc0;
input _tail_ $ _head_ $ _cost_ _capac_ _lo_ diagonal factory
key_id $10. mth_made $ _name_&$17. ;
datalines;
fact1_1 f1_mar_1 127.9 500 50 19 1 production March prod f1 19 mar
fact1_1 f1_apr_1 78.6 600 50 19 1 production April prod f1 19 apl
fact1_1 f1_may_1 95.1 400 50 19 1 production May .
f1_mar_1 f1_apr_1 15 50 . 19 1 storage March .
f1_apr_1 f1_may_1 12 50 . 19 1 storage April .
f1_apr_1 f1_mar_1 28 20 . 19 1 backorder April back f1 19 apl
f1_may_1 f1_apr_1 28 20 . 19 1 backorder May back f1 19 may
f1_mar_1 f2_mar_1 11 . . 19 . f1_to_2 March .
f1_apr_1 f2_apr_1 11 . . 19 . f1_to_2 April .
f1_may_1 f2_may_1 16 . . 19 . f1_to_2 May .
f1_mar_1 shop1_1 -327.65 250 . 19 1 sales March .
f1_apr_1 shop1_1 -300 250 . 19 1 sales April .
f1_may_1 shop1_1 -285 250 . 19 1 sales May .
f1_mar_1 shop2_1 -362.74 250 . 19 1 sales March .
f1_apr_1 shop2_1 -300 250 . 19 1 sales April .
f1_may_1 shop2_1 -245 250 . 19 1 sales May .
fact2_1 f2_mar_1 88.0 450 35 19 2 production March prod f2 19 mar
fact2_1 f2_apr_1 62.4 480 35 19 2 production April prod f2 19 apl
fact2_1 f2_may_1 133.8 250 35 19 2 production May .
f2_mar_1 f2_apr_1 18 30 . 19 2 storage March .
f2_apr_1 f2_may_1 20 30 . 19 2 storage April .
f2_apr_1 f2_mar_1 17 15 . 19 2 backorder April back f2 19 apl
f2_may_1 f2_apr_1 25 15 . 19 2 backorder May back f2 19 may
f2_mar_1 f1_mar_1 10 40 . 19 . f2_to_1 March .
f2_apr_1 f1_apr_1 11 40 . 19 . f2_to_1 April .
f2_may_1 f1_may_1 13 40 . 19 . f2_to_1 May .
f2_mar_1 shop1_1 -297.4 250 . 19 2 sales March .
f2_apr_1 shop1_1 -290 250 . 19 2 sales April .
f2_may_1 shop1_1 -292 250 . 19 2 sales May .
f2_mar_1 shop2_1 -272.7 250 . 19 2 sales March .
f2_apr_1 shop2_1 -312 250 . 19 2 sales April .
f2_may_1 shop2_1 -299 250 . 19 2 sales May .
fact1_2 f1_mar_2 217.9 400 40 25 1 production March prod f1 25 mar
fact1_2 f1_apr_2 174.5 550 50 25 1 production April prod f1 25 apl
fact1_2 f1_may_2 133.3 350 40 25 1 production May .
f1_mar_2 f1_apr_2 20 40 . 25 1 storage March .
f1_apr_2 f1_may_2 18 40 . 25 1 storage April .
f1_apr_2 f1_mar_2 32 30 . 25 1 backorder April back f1 25 apl
f1_may_2 f1_apr_2 41 15 . 25 1 backorder May back f1 25 may
f1_mar_2 f2_mar_2 23 . . 25 . f1_to_2 March .
f1_apr_2 f2_apr_2 23 . . 25 . f1_to_2 April .
f1_may_2 f2_may_2 26 . . 25 . f1_to_2 May .
f1_mar_2 shop1_2 -559.76 . . 25 1 sales March .
f1_apr_2 shop1_2 -524.28 . . 25 1 sales April .
f1_may_2 shop1_2 -475.02 . . 25 1 sales May .
f1_mar_2 shop2_2 -623.89 . . 25 1 sales March .
f1_apr_2 shop2_2 -549.68 . . 25 1 sales April .
f1_may_2 shop2_2 -460.00 . . 25 1 sales May .
fact2_2 f2_mar_2 182.0 650 35 25 2 production March prod f2 25 mar
fact2_2 f2_apr_2 196.7 680 35 25 2 production April prod f2 25 apl
fact2_2 f2_may_2 201.4 550 35 25 2 production May .
f2_mar_2 f2_apr_2 28 50 . 25 2 storage March .
f2_apr_2 f2_may_2 38 50 . 25 2 storage April .
f2_apr_2 f2_mar_2 31 15 . 25 2 backorder April back f2 25 apl
f2_may_2 f2_apr_2 54 15 . 25 2 backorder May back f2 25 may
f2_mar_2 f1_mar_2 20 25 . 25 . f2_to_1 March .
f2_apr_2 f1_apr_2 21 25 . 25 . f2_to_1 April .
f2_may_2 f1_may_2 43 25 . 25 . f2_to_1 May .
f2_mar_2 shop1_2 -567.83 500 . 25 2 sales March .
f2_apr_2 shop1_2 -542.19 500 . 25 2 sales April .
f2_may_2 shop1_2 -461.56 500 . 25 2 sales May .
f2_mar_2 shop2_2 -542.83 500 . 25 2 sales March .
f2_apr_2 shop2_2 -559.19 500 . 25 2 sales April .
f2_may_2 shop2_2 -489.06 500 . 25 2 sales May .
;
proc intpoint
bytes=1000000
printlevel2=2
nodedata=node0
arcdata=arc0
conout=arc1;
run;
proc print data=arc1 width=min;
var _tail_ _head_ _cost_ _capac_ _lo_ _flow_ _fcost_
diagonal factory key_id mth_made;
sum _fcost_;
run;
The following notes appear on the SAS log:
NOTE: Number of nodes= 20 .
NOTE: Number of supply nodes= 4 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 4350 , total demand= 4150 .
NOTE: Number of arcs= 64 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 21 .
NOTE: Number of >= constraints= 0 .
NOTE: Number of constraint coefficients= 136 .
NOTE: Number of variables= 68 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 20.
NOTE: After preprocessing, number of >= constraints= 0.
NOTE: The preprocessor eliminated 1 constraints from the problem.
NOTE: The preprocessor eliminated 9 constraint coefficients from the problem.
NOTE: 0 columns, 0 rows and 0 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 48 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 20 factor nodes make up 8 supernodes
NOTE: There are 27 nonzero sub-rows or sub-columns outside the supernodal triangular
      regions along the factor's leading diagonal.
Iter Complem_aff Complem-ity Duality_gap Tot_infeasb Tot_infeasc Tot_infeasd
0 -1.000000 192857968 0.895105 66024 25664 0
1 37620673 24479828 0.919312 4575.155540 1778.391068 0
2 4392127 1833947 0.594993 0 0 0
3 654204 426961 0.249790 0 0 0
4 161214 108340 0.075186 0 0 0
5 50985 43146 0.030894 0 0 0
6 37774 34993 0.025167 0 0 0
7 17695 9774.172272 0.007114 0 0 0
8 2421.777663 1427.435257 0.001042 0 0 0
9 522.394743 240.454270 0.000176 0 0 0
10 57.447587 7.581156 0.000005540 0 0 0
11 0.831035 0.007569 5.5317109E-9 0 0 0
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 11
iterations.
NOTE: Optimum reached.
NOTE: Objective= -1281110.338.
NOTE: The data set WORK.ARC1 has 64 observations and 14 variables.
NOTE: There were 64 observations read from the data set WORK.ARC0.
NOTE: There were 8 observations read from the data set WORK.NODE0.
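The supply and demand totals reported in these notes can be tallied directly from the node0 data. The following Python snippet is a quick check of that arithmetic, not part of the SAS program:

```python
# _supdem_ values from the node0 data set: positive = supply, negative = demand
supdem = {"fact1_1": 1000, "fact2_1": 850, "fact1_2": 1000, "fact2_2": 1500,
          "shop1_1": -900, "shop2_1": -900, "shop1_2": -900, "shop2_2": -1450}

total_supply = sum(v for v in supdem.values() if v > 0)
total_demand = -sum(v for v in supdem.values() if v < 0)
print(total_supply, total_demand)  # 4350 4150, as reported in the log
```

Total supply exceeds total demand by 200 sets.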
The solution is given in the CONOUT=arc1 data set. In the CONOUT= data set, shown in
Output 4.1.1, the variables diagonal, factory, key_id, and mth_made form an implicit ID list. The
diagonal variable has one of two values, 19 or 25. factory also has one of two values, 1 or 2, to denote
the factory where either production or storage occurs, from where TVs are either sold to shops or
used to satisfy backorders. production, storage, sales, and backorder are values of the key_id variable.
Other values of this variable, f1_to_2 and f2_to_1, are used when flow through arcs represents the
transportation of TVs between factories. The mth_made variable has values March, April, and May, the
months when TVs that are modeled as flow through an arc were made (assuming that no televisions
are stored for more than one month and none manufactured in May are used to fill March backorders).
These ID variables can be used after the PROC INTPOINT run to produce reports and perform
analysis on particular parts of the company's operation. For example, reports can be generated for
production numbers for each factory; optimal sales figures for each shop; and how many TVs should
be stored, used to fill backorders, sent to the other factory, or any combination of these, for TVs with
a particular screen, those produced in a particular month, or both.
Output 4.1.1 CONOUT=ARC1
Production Planning/Inventory/Distribution
Minimum Cost Flow problem
Obs _tail_ _head_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_ diagonal factory key_id mth_made
1 fact1_1 f1_apr_1 78.6 600 50 600.000 47160.00 19 1 production April
2 f1_mar_1 f1_apr_1 15.0 50 0 0.000 0.00 19 1 storage March
3 f1_may_1 f1_apr_1 28.0 20 0 0.000 0.00 19 1 backorder May
4 f2_apr_1 f1_apr_1 11.0 40 0 0.000 0.00 19 . f2_to_1 April
5 fact1_2 f1_apr_2 174.5 550 50 550.000 95975.00 25 1 production April
6 f1_mar_2 f1_apr_2 20.0 40 0 0.000 0.00 25 1 storage March
7 f1_may_2 f1_apr_2 41.0 15 0 15.000 615.00 25 1 backorder May
8 f2_apr_2 f1_apr_2 21.0 25 0 0.000 0.00 25 . f2_to_1 April
9 fact1_1 f1_mar_1 127.9 500 50 344.999 44125.43 19 1 production March
10 f1_apr_1 f1_mar_1 28.0 20 0 20.000 560.00 19 1 backorder April
11 f2_mar_1 f1_mar_1 10.0 40 0 40.000 400.00 19 . f2_to_1 March
12 fact1_2 f1_mar_2 217.9 400 40 400.000 87160.00 25 1 production March
13 f1_apr_2 f1_mar_2 32.0 30 0 30.000 960.00 25 1 backorder April
14 f2_mar_2 f1_mar_2 20.0 25 0 25.000 500.00 25 . f2_to_1 March
15 fact1_1 f1_may_1 95.1 400 50 50.001 4755.06 19 1 production May
16 f1_apr_1 f1_may_1 12.0 50 0 50.000 600.00 19 1 storage April
17 f2_may_1 f1_may_1 13.0 40 0 0.000 0.00 19 . f2_to_1 May
18 fact1_2 f1_may_2 133.3 350 40 40.000 5332.04 25 1 production May
19 f1_apr_2 f1_may_2 18.0 40 0 0.000 0.00 25 1 storage April
20 f2_may_2 f1_may_2 43.0 25 0 0.000 0.00 25 . f2_to_1 May
21 f1_apr_1 f2_apr_1 11.0 99999999 0 30.000 330.00 19 . f1_to_2 April
22 fact2_1 f2_apr_1 62.4 480 35 480.000 29952.00 19 2 production April
23 f2_mar_1 f2_apr_1 18.0 30 0 0.000 0.00 19 2 storage March
24 f2_may_1 f2_apr_1 25.0 15 0 0.000 0.00 19 2 backorder May
25 f1_apr_2 f2_apr_2 23.0 99999999 0 0.000 0.00 25 . f1_to_2 April
26 fact2_2 f2_apr_2 196.7 680 35 680.000 133755.99 25 2 production April
27 f2_mar_2 f2_apr_2 28.0 50 0 0.000 0.00 25 2 storage March
28 f2_may_2 f2_apr_2 54.0 15 0 15.000 810.00 25 2 backorder May
Output 4.1.1 continued

Obs _tail_ _head_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_ diagonal factory key_id mth_made
29 f1_mar_1 f2_mar_1 11.0 99999999 0 0.000 0.00 19 . f1_to_2 March
30 fact2_1 f2_mar_1 88.0 450 35 290.000 25520.00 19 2 production March
31 f2_apr_1 f2_mar_1 17.0 15 0 0.000 0.00 19 2 backorder April
32 f1_mar_2 f2_mar_2 23.00 99999999 0 0.000 0.00 25 . f1_to_2 March
33 fact2_2 f2_mar_2 182.00 650 35 645.000 117389.96 25 2 production March
34 f2_apr_2 f2_mar_2 31.00 15 0 0.000 0.00 25 2 backorder April
35 f1_may_1 f2_may_1 16.00 99999999 0 100.000 1600.01 19 . f1_to_2 May
36 fact2_1 f2_may_1 133.80 250 35 35.000 4683.00 19 2 production May
37 f2_apr_1 f2_may_1 20.00 30 0 15.000 299.99 19 2 storage April
38 f1_may_2 f2_may_2 26.00 99999999 0 0.000 0.00 25 . f1_to_2 May
39 fact2_2 f2_may_2 201.40 550 35 35.000 7049.00 25 2 production May
40 f2_apr_2 f2_may_2 38.00 50 0 0.000 0.00 25 2 storage April
41 f1_mar_1 shop1_1 -327.65 250 0 154.999 -50785.56 19 1 sales March
42 f1_apr_1 shop1_1 -300.00 250 0 250.000 -75000.00 19 1 sales April
43 f1_may_1 shop1_1 -285.00 250 0 0.000 0.00 19 1 sales May
44 f2_mar_1 shop1_1 -297.40 250 0 250.000 -74349.99 19 2 sales March
45 f2_apr_1 shop1_1 -290.00 250 0 245.001 -71050.17 19 2 sales April
46 f2_may_1 shop1_1 -292.00 250 0 0.000 0.00 19 2 sales May
47 f1_mar_2 shop1_2 -559.76 99999999 0 0.000 0.00 25 1 sales March
48 f1_apr_2 shop1_2 -524.28 99999999 0 0.000 -0.01 25 1 sales April
49 f1_may_2 shop1_2 -475.02 99999999 0 25.000 -11875.64 25 1 sales May
50 f2_mar_2 shop1_2 -567.83 500 0 500.000 -283915.00 25 2 sales March
51 f2_apr_2 shop1_2 -542.19 500 0 375.000 -203321.08 25 2 sales April
52 f2_may_2 shop1_2 -461.56 500 0 0.000 0.00 25 2 sales May
53 f1_mar_1 shop2_1 -362.74 250 0 250.000 -90685.00 19 1 sales March
54 f1_apr_1 shop2_1 -300.00 250 0 250.000 -75000.00 19 1 sales April
55 f1_may_1 shop2_1 -245.00 250 0 0.000 0.00 19 1 sales May
56 f2_mar_1 shop2_1 -272.70 250 0 0.000 0.00 19 2 sales March
57 f2_apr_1 shop2_1 -312.00 250 0 250.000 -78000.00 19 2 sales April
58 f2_may_1 shop2_1 -299.00 250 0 150.000 -44850.00 19 2 sales May
59 f1_mar_2 shop2_2 -623.89 99999999 0 455.000 -283869.94 25 1 sales March
60 f1_apr_2 shop2_2 -549.68 99999999 0 535.000 -294078.78 25 1 sales April
61 f1_may_2 shop2_2 -460.00 99999999 0 0.000 0.00 25 1 sales May
62 f2_mar_2 shop2_2 -542.83 500 0 120.000 -65139.47 25 2 sales March
63 f2_apr_2 shop2_2 -559.19 500 0 320.000 -178940.96 25 2 sales April
64 f2_may_2 shop2_2 -489.06 500 0 20.000 -9781.20 25 2 sales May
===========
-1281110.34
Example 4.2: Altering Arc Data
This example examines the effect of changing some of the arc costs. The backorder penalty costs are
increased by 20 percent. The sales profit of 25-inch TVs sent to the shops in May is increased by 30
units. The backorder penalty costs of 25-inch TVs manufactured in May for April consumption are
decreased by 30 units. The production costs of 19-inch and 25-inch TVs made in May are decreased
by 5 units and 20 units, respectively. How does the optimal solution of the network after these arc
cost alterations compare with the optimum of the original network?
These SAS statements produce the new NODEDATA= and ARCDATA= data sets:
title2 'Minimum Cost Flow problem- Altered Arc Data';
data arc2;
   set arc1;
   oldcost=_cost_;
   oldfc=_fcost_;
   oldflow=_flow_;
   if key_id='backorder' then _cost_=_cost_*1.2;
   else if _tail_='f2_may_2' then _cost_=_cost_-30;
   if key_id='production' & mth_made='May' then
      if diagonal=19 then _cost_=_cost_-5;
      else _cost_=_cost_-20;
run;
proc intpoint
bytes=100000
printlevel2=2
nodedata=node0
arcdata=arc2
conout=arc3;
run;
proc print data=arc3;
   var _tail_ _head_ _capac_ _lo_ _supply_ _demand_ _name_
       _cost_ _flow_ _fcost_ oldcost oldflow oldfc
       diagonal factory key_id mth_made;
   /* to get this variable order */
   sum oldfc _fcost_;
run;
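Because the backorder costs in arc2 are the arc1 costs scaled by 1.2, the new values can be spot-checked before looking at the output. A quick Python check of that arithmetic (illustrative only; the constants are the distinct backorder penalty costs from the original arc data):

```python
# Distinct backorder penalty costs from arc0/arc1
old_backorder_costs = [28, 41, 32, 25, 54, 17, 31]

# A 20 percent increase, as applied in the arc2 DATA step
new_costs = [round(c * 1.2, 2) for c in old_backorder_costs]
print(new_costs)  # [33.6, 49.2, 38.4, 30.0, 64.8, 20.4, 37.2]
```

These values match the _cost_ column for the backorder arcs in Output 4.2.1 (for example, 33.60 for back f1 19 may and 38.40 for back f1 25 apl).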
The following notes appear on the SAS log:
NOTE: Number of nodes= 20 .
NOTE: Number of supply nodes= 4 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 4350 , total demand= 4150 .
NOTE: Number of arcs= 64 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 0 .
NOTE: Number of == constraints= 21 .
NOTE: Number of >= constraints= 0 .
NOTE: Number of constraint coefficients= 136 .
NOTE: Number of variables= 68 .
NOTE: After preprocessing, number of <= constraints= 0.
NOTE: After preprocessing, number of == constraints= 20.
NOTE: After preprocessing, number of >= constraints= 0.
NOTE: The preprocessor eliminated 1 constraints from the problem.
NOTE: The preprocessor eliminated 9 constraint coefficients from the problem.
NOTE: 0 columns, 0 rows and 0 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 48 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 20 factor nodes make up 8 supernodes
NOTE: There are 27 nonzero sub-rows or sub-columns outside the supernodal triangular
      regions along the factor's leading diagonal.
Iter Complem_aff Complem-ity Duality_gap Tot_infeasb Tot_infeasc Tot_infeasd
0 -1.000000 193775969 0.894415 66024 25664 0
1 37797544 24594220 0.918149 4566.893212 1775.179450 0
2 4408681 1844606 0.590964 0 0 0
3 347168 312126 0.194113 0 0 0
4 145523 86002 0.060330 0 0 0
5 43008 38240 0.027353 0 0 0
6 31097 21145 0.015282 0 0 0
7 9308.807034 4158.399675 0.003029 0 0 0
8 1710.832075 752.174595 0.000549 0 0 0
9 254.197112 47.755299 0.000034846 0 0 0
10 5.252560 0.010692 7.8017564E-9 0 0 0
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 10
iterations.
NOTE: Optimum reached.
NOTE: Objective= -1285086.442.
NOTE: The data set WORK.ARC3 has 64 observations and 17 variables.
NOTE: There were 64 observations read from the data set WORK.ARC2.
NOTE: There were 8 observations read from the data set WORK.NODE0.
The solution is displayed in Output 4.2.1.
Output 4.2.1 CONOUT=ARC3
Production Planning/Inventory/Distribution
Minimum Cost Flow problem- Altered arc data
_tail_ _head_ _capac_ _lo_ _SUPPLY_ _DEMAND_ _name_ _cost_ _FLOW_
fact1_1 f1_apr_1 600 50 1000 . prod f1 19 apl 78.60 540.000
f1_mar_1 f1_apr_1 50 0 . . 15.00 0.000
f1_may_1 f1_apr_1 20 0 . . back f1 19 may 33.60 0.000
f2_apr_1 f1_apr_1 40 0 . . 11.00 0.000
fact1_2 f1_apr_2 550 50 1000 . prod f1 25 apl 174.50 250.000
f1_mar_2 f1_apr_2 40 0 . . 20.00 0.000
f1_may_2 f1_apr_2 15 0 . . back f1 25 may 49.20 15.000
f2_apr_2 f1_apr_2 25 0 . . 21.00 0.000
fact1_1 f1_mar_1 500 50 1000 . prod f1 19 mar 127.90 340.000
f1_apr_1 f1_mar_1 20 0 . . back f1 19 apl 33.60 20.000
f2_mar_1 f1_mar_1 40 0 . . 10.00 40.000
fact1_2 f1_mar_2 400 40 1000 . prod f1 25 mar 217.90 400.000
f1_apr_2 f1_mar_2 30 0 . . back f1 25 apl 38.40 30.000
f2_mar_2 f1_mar_2 25 0 . . 20.00 25.000
fact1_1 f1_may_1 400 50 1000 . 90.10 115.000
f1_apr_1 f1_may_1 50 0 . . 12.00 0.000
f2_may_1 f1_may_1 40 0 . . 13.00 0.000
fact1_2 f1_may_2 350 40 1000 . 113.30 350.000
f1_apr_2 f1_may_2 40 0 . . 18.00 0.000
f2_may_2 f1_may_2 25 0 . . 13.00 0.000
f1_apr_1 f2_apr_1 99999999 0 . . 11.00 20.000
fact2_1 f2_apr_1 480 35 850 . prod f2 19 apl 62.40 480.000
f2_mar_1 f2_apr_1 30 0 . . 18.00 0.000
f2_may_1 f2_apr_1 15 0 . . back f2 19 may 30.00 0.000
f1_apr_2 f2_apr_2 99999999 0 . . 23.00 0.000
fact2_2 f2_apr_2 680 35 1500 . prod f2 25 apl 196.70 680.000
f2_mar_2 f2_apr_2 50 0 . . 28.00 0.000
f2_may_2 f2_apr_2 15 0 . . back f2 25 may 64.80 0.000
f1_mar_1 f2_mar_1 99999999 0 . . 11.00 0.000
fact2_1 f2_mar_1 450 35 850 . prod f2 19 mar 88.00 290.000
f2_apr_1 f2_mar_1 15 0 . . back f2 19 apl 20.40 0.000
f1_mar_2 f2_mar_2 99999999 0 . . 23.00 0.000
fact2_2 f2_mar_2 650 35 1500 . prod f2 25 mar 182.00 635.000
f2_apr_2 f2_mar_2 15 0 . . back f2 25 apl 37.20 0.000
f1_may_1 f2_may_1 99999999 0 . . 16.00 115.000
fact2_1 f2_may_1 250 35 850 . 128.80 35.000
f2_apr_1 f2_may_1 30 0 . . 20.00 0.000
f1_may_2 f2_may_2 99999999 0 . . 26.00 335.000
fact2_2 f2_may_2 550 35 1500 . 181.40 35.000
f2_apr_2 f2_may_2 50 0 . . 38.00 0.000
f1_mar_1 shop1_1 250 0 . 900 -327.65 150.000
f1_apr_1 shop1_1 250 0 . 900 -300.00 250.000
f1_may_1 shop1_1 250 0 . 900 -285.00 0.000
f2_mar_1 shop1_1 250 0 . 900 -297.40 250.000
f2_apr_1 shop1_1 250 0 . 900 -290.00 250.000
Output 4.2.1 continued

_tail_ _head_ _capac_ _lo_ _SUPPLY_ _DEMAND_ _name_ _cost_ _FLOW_
f2_may_1 shop1_1 250 0 . 900 -292.00 0.000
f1_mar_2 shop1_2 99999999 0 . 900 -559.76 0.000
f1_apr_2 shop1_2 99999999 0 . 900 -524.28 0.000
f1_may_2 shop1_2 99999999 0 . 900 -475.02 0.000
f2_mar_2 shop1_2 500 0 . 900 -567.83 500.000
f2_apr_2 shop1_2 500 0 . 900 -542.19 400.000
f2_may_2 shop1_2 500 0 . 900 -491.56 0.000
f1_mar_1 shop2_1 250 0 . 900 -362.74 250.000
f1_apr_1 shop2_1 250 0 . 900 -300.00 250.000
f1_may_1 shop2_1 250 0 . 900 -245.00 0.000
f2_mar_1 shop2_1 250 0 . 900 -272.70 0.000
f2_apr_1 shop2_1 250 0 . 900 -312.00 250.000
f2_may_1 shop2_1 250 0 . 900 -299.00 150.000
f1_mar_2 shop2_2 99999999 0 . 1450 -623.89 455.000
f1_apr_2 shop2_2 99999999 0 . 1450 -549.68 235.000
f1_may_2 shop2_2 99999999 0 . 1450 -460.00 0.000
f2_mar_2 shop2_2 500 0 . 1450 -542.83 110.000
f2_apr_2 shop2_2 500 0 . 1450 -559.19 280.000
f2_may_2 shop2_2 500 0 . 1450 -519.06 370.000
Obs _FCOST_ oldcost oldflow oldfc diagonal factory key_id mth_made
1 42444.01 78.60 600.000 47160.00 19 1 production April
2 0.00 15.00 0.000 0.00 19 1 storage March
3 0.00 28.00 0.000 0.00 19 1 backorder May
4 0.00 11.00 0.000 0.00 19 . f2_to_1 April
5 43625.00 174.50 550.000 95975.00 25 1 production April
6 0.00 20.00 0.000 0.00 25 1 storage March
7 738.00 41.00 15.000 615.00 25 1 backorder May
8 0.00 21.00 0.000 0.00 25 . f2_to_1 April
9 43486.02 127.90 344.999 44125.43 19 1 production March
10 672.00 28.00 20.000 560.00 19 1 backorder April
11 400.00 10.00 40.000 400.00 19 . f2_to_1 March
12 87160.00 217.90 400.000 87160.00 25 1 production March
13 1152.00 32.00 30.000 960.00 25 1 backorder April
14 500.00 20.00 25.000 500.00 25 . f2_to_1 March
15 10361.47 95.10 50.001 4755.06 19 1 production May
16 0.00 12.00 50.000 600.00 19 1 storage April
17 0.00 13.00 0.000 0.00 19 . f2_to_1 May
18 39655.00 133.30 40.000 5332.04 25 1 production May
19 0.00 18.00 0.000 0.00 25 1 storage April
20 0.00 43.00 0.000 0.00 25 . f2_to_1 May
21 220.00 11.00 30.000 330.00 19 . f1_to_2 April
22 29952.00 62.40 480.000 29952.00 19 2 production April
23 0.00 18.00 0.000 0.00 19 2 storage March
24 0.00 25.00 0.000 0.00 19 2 backorder May
25 0.00 23.00 0.000 0.00 25 . f1_to_2 April
26 133755.99 196.70 680.000 133755.99 25 2 production April
27 0.00 28.00 0.000 0.00 25 2 storage March
28 0.00 54.00 15.000 810.00 25 2 backorder May
29 0.00 11.00 0.000 0.00 19 . f1_to_2 March
30 25520.00 88.00 290.000 25520.00 19 2 production March
31 0.00 17.00 0.000 0.00 19 2 backorder April
32 0.00 23.00 0.000 0.00 25 . f1_to_2 March
33 115570.01 182.00 645.000 117389.96 25 2 production March
34 0.00 31.00 0.000 0.00 25 2 backorder April
35 1840.00 16.00 100.000 1600.01 19 . f1_to_2 May
36 4508.00 133.80 35.000 4683.00 19 2 production May
37 0.00 20.00 15.000 299.99 19 2 storage April
38 8710.00 26.00 0.000 0.00 25 . f1_to_2 May
39 6349.00 201.40 35.000 7049.00 25 2 production May
40 0.00 38.00 0.000 0.00 25 2 storage April
41 -49147.54 -327.65 154.999 -50785.56 19 1 sales March
42 -75000.00 -300.00 250.000 -75000.00 19 1 sales April
43 -0.01 -285.00 0.000 0.00 19 1 sales May
44 -74350.00 -297.40 250.000 -74349.99 19 2 sales March
45 -72499.96 -290.00 245.001 -71050.17 19 2 sales April
Obs _FCOST_ oldcost oldflow oldfc diagonal factory key_id mth_made
46 0.00 -292.00 0.000 0.00 19 2 sales May
47 0.00 -559.76 0.000 0.00 25 1 sales March
48 -0.01 -524.28 0.000 -0.01 25 1 sales April
49 -0.06 -475.02 25.000 -11875.64 25 1 sales May
50 -283915.00 -567.83 500.000 -283915.00 25 2 sales March
51 -216875.92 -542.19 375.000 -203321.08 25 2 sales April
52 0.00 -461.56 0.000 0.00 25 2 sales May
53 -90685.00 -362.74 250.000 -90685.00 19 1 sales March
54 -75000.00 -300.00 250.000 -75000.00 19 1 sales April
55 0.00 -245.00 0.000 0.00 19 1 sales May
56 -0.01 -272.70 0.000 0.00 19 2 sales March
57 -78000.00 -312.00 250.000 -78000.00 19 2 sales April
58 -44849.99 -299.00 150.000 -44850.00 19 2 sales May
59 -283869.94 -623.89 455.000 -283869.94 25 1 sales March
60 -129174.80 -549.68 535.000 -294078.78 25 1 sales April
61 0.00 -460.00 0.000 0.00 25 1 sales May
62 -59711.32 -542.83 120.000 -65139.47 25 2 sales March
63 -156573.27 -559.19 320.000 -178940.96 25 2 sales April
64 -192052.13 -489.06 20.000 -9781.20 25 2 sales May
=========== ===========
-1285086.44 -1281110.34
Example 4.3: Adding Side Constraints
The manufacturer of Gizmo chips, which are parts needed to make televisions, can supply only 2,600
chips to factory 1 and 3,750 chips to factory 2 in time for production in each of the months of March
and April. However, Gizmo chips will not be in short supply in May. Three chips are required to
make each 19-inch TV, while the 25-inch TVs require four chips each. To limit the production of
televisions produced at factory 1 in March so that the TVs have the correct number of chips, a side
constraint called FACT1 MAR GIZMO is used. The form of this constraint is

   3 * prod f1 19 mar + 4 * prod f1 25 mar <= 2600

prod f1 19 mar is the name of the arc directed from the node fact1_1 toward node f1_mar_1 and, in the
previous constraint, designates the flow assigned to this arc. The ARCDATA= and CONOUT= data
sets have arc names in a variable called _name_.
The other side constraints (shown below) are called FACT2 MAR GIZMO, FACT1 APL GIZMO, and
FACT2 APL GIZMO.

   3 * prod f2 19 mar + 4 * prod f2 25 mar <= 3750
   3 * prod f1 19 apl + 4 * prod f1 25 apl <= 2600
   3 * prod f2 19 apl + 4 * prod f2 25 apl <= 3750
To maintain customer goodwill, the total number of backorders is not to exceed 50 sets. The side
constraint TOTAL BACKORDER that models this restriction is
back f1 19 apl + back f1 25 apl +
back f2 19 apl + back f2 25 apl +
back f1 19 may + back f1 25 may +
back f2 19 may + back f2 25 may <= 50
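The chip constraints are not vacuous: the solution of Example 4.1 already violates the factory 1 March limit. Using the rounded production flows from Output 4.1.1, a quick arithmetic check (illustrative Python, not part of the SAS program):

```python
# Rounded March production flows at factory 1 from Output 4.1.1
prod_f1_19_mar = 345   # flow on arc fact1_1 -> f1_mar_1
prod_f1_25_mar = 400   # flow on arc fact1_2 -> f1_mar_2

chips_used = 3 * prod_f1_19_mar + 4 * prod_f1_25_mar
print(chips_used)           # 2635
print(chips_used <= 2600)   # False: these flows exceed the chip limit
```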
The sparse CONDATA= data set format is used. All side constraints are of the less-than-or-equal
type. Because this is the default type value for the DEFCONTYPE= option, type information is
not necessary in the following CONDATA=con3. Also, DEFCONTYPE=<= does not have to
be specified in the PROC INTPOINT statement that follows. Notice that the _column_ variable
value CHIP/BO LIMIT indicates that an observation of the con3 data set contains rhs information.
Therefore, specify RHSOBS='CHIP/BO LIMIT'.
title2 'Adding Side Constraints';
data con3;
input _column_ &$14. _row_ &$15. _coef_ ;
datalines;
prod f1 19 mar FACT1 MAR GIZMO 3
prod f1 25 mar FACT1 MAR GIZMO 4
CHIP/BO LIMIT FACT1 MAR GIZMO 2600
prod f2 19 mar FACT2 MAR GIZMO 3
prod f2 25 mar FACT2 MAR GIZMO 4
CHIP/BO LIMIT FACT2 MAR GIZMO 3750
prod f1 19 apl FACT1 APL GIZMO 3
prod f1 25 apl FACT1 APL GIZMO 4
CHIP/BO LIMIT FACT1 APL GIZMO 2600
prod f2 19 apl FACT2 APL GIZMO 3
prod f2 25 apl FACT2 APL GIZMO 4
CHIP/BO LIMIT FACT2 APL GIZMO 3750
back f1 19 apl TOTAL BACKORDER 1
back f1 25 apl TOTAL BACKORDER 1
back f2 19 apl TOTAL BACKORDER 1
back f2 25 apl TOTAL BACKORDER 1
back f1 19 may TOTAL BACKORDER 1
back f1 25 may TOTAL BACKORDER 1
back f2 19 may TOTAL BACKORDER 1
back f2 25 may TOTAL BACKORDER 1
CHIP/BO LIMIT TOTAL BACKORDER 50
;
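In this sparse format, each observation supplies one (column, row, coefficient) triple, and observations whose _column_ value matches RHSOBS= supply a right-hand side instead. How the triples assemble into a constraint can be sketched as follows (illustrative Python, not SAS):

```python
# A few (column, row, coefficient) triples from the con3 data set
triples = [
    ("prod f1 19 mar", "FACT1 MAR GIZMO", 3),
    ("prod f1 25 mar", "FACT1 MAR GIZMO", 4),
    ("CHIP/BO LIMIT",  "FACT1 MAR GIZMO", 2600),  # the RHSOBS observation
]

coeffs, rhs = {}, {}
for column, row, coef in triples:
    if column == "CHIP/BO LIMIT":      # matches RHSOBS='CHIP/BO LIMIT'
        rhs[row] = coef                # right-hand side of the constraint
    else:
        coeffs.setdefault(row, {})[column] = coef

print(coeffs["FACT1 MAR GIZMO"])  # {'prod f1 19 mar': 3, 'prod f1 25 mar': 4}
print(rhs["FACT1 MAR GIZMO"])     # 2600
```

Together with the default <= type, these entries reproduce the FACT1 MAR GIZMO constraint shown earlier.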
The four pairs of data sets that follow can be used as ARCDATA= and NODEDATA= data sets in the
following PROC INTPOINT run. The set used depends on which cost information the arcs are to
have.
ARCDATA=arc0 NODEDATA=node0
ARCDATA=arc1 NODEDATA=node0
ARCDATA=arc2 NODEDATA=node0
ARCDATA=arc3 NODEDATA=node0
arc0, node0, and arc1 were created in Example 4.1. The first two data sets are the original input data
sets.
In the previous example, arc2 was created by modifying arc1 to reflect different arc costs. arc2 and
node0 can also be used as the ARCDATA= and NODEDATA= data sets in a PROC INTPOINT run.
If you are going to continue optimization using the changed arc costs, it is probably best to use arc3
and node0 as the ARCDATA= and NODEDATA= data sets.
PROC INTPOINT is used to find the changed-cost network solution that obeys the chip limit and
backorder side constraints. An explicit ID list has also been specified so that the variables oldcost,
oldfc, and oldflow do not appear in the subsequent output data sets:
proc intpoint
bytes=1000000
printlevel2=2
nodedata=node0 arcdata=arc3
condata=con3 sparsecondata rhsobs='CHIP/BO LIMIT'
conout=arc4;
id diagonal factory key_id mth_made;
run;
proc print data=arc4;
var _tail_ _head_ _cost_ _capac_ _lo_ _flow_ _fcost_;
/* to get this variable order */
sum _fcost_;
run;
The following messages appear on the SAS log:
NOTE: The following variables in ARCDATA do not belong to any SAS variable list.
These will be ignored.
_FLOW_
_FCOST_
oldcost
oldfc
oldflow
NOTE: Number of nodes= 20 .
NOTE: Number of supply nodes= 4 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 4350 , total demand= 4150 .
NOTE: Number of arcs= 64 .
NOTE: Number of <= side constraints= 5 .
NOTE: Number of == side constraints= 0 .
NOTE: Number of >= side constraints= 0 .
NOTE: Number of side constraint coefficients= 16 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 5 .
NOTE: Number of == constraints= 21 .
NOTE: Number of >= constraints= 0 .
NOTE: Number of constraint coefficients= 152 .
NOTE: Number of variables= 68 .
NOTE: After preprocessing, number of <= constraints= 5.
NOTE: After preprocessing, number of == constraints= 20.
NOTE: After preprocessing, number of >= constraints= 0.
NOTE: The preprocessor eliminated 1 constraints from the problem.
NOTE: The preprocessor eliminated 9 constraint coefficients from the problem.
NOTE: 5 columns, 0 rows and 5 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 74 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 25 factor nodes make up 17 supernodes
NOTE: There are 88 nonzero sub-rows or sub-columns outside the supernodal triangular
regions along the factor's leading diagonal.
Iter Complem_aff Complem-ity Duality_gap Tot_infeasb Tot_infeasc Tot_infeasd
0 -1.000000 199456613 0.894741 65408 35351 10906
1 38664128 25735020 0.919726 4738.839318 2561.195456 248.292591
2 5142982 1874540 0.595158 0 0 6.669426
3 366112 338310 0.207256 0 0 1.207816
4 172159 90907 0.063722 0 0 0.238703
5 48403 38889 0.027839 0 0 0.115586
6 28882 17979 0.013029 0 0 0.019825
7 7800.003324 3605.779203 0.002631 0 0 0.004077
8 1564.193112 422.251530 0.000309 0 0 0.000225
9 94.768595 16.589795 0.000012126 0 0 0
10 0.294833 0.001048 5.96523E-10 0 0 0
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 10
iterations.
NOTE: Optimum reached.
NOTE: Objective= -1282708.622.
NOTE: The data set WORK.ARC4 has 64 observations and 14 variables.
NOTE: There were 64 observations read from the data set WORK.ARC3.
NOTE: There were 8 observations read from the data set WORK.NODE0.
NOTE: There were 21 observations read from the data set WORK.CON3.
Output 4.3.1 CONOUT=ARC4
Production Planning/Inventory/Distribution
Adding Side Constraints
Obs _tail_ _head_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_
1 fact1_1 f1_apr_1 78.60 600 50 533.333 41920.00
2 f1_mar_1 f1_apr_1 15.00 50 0 0.000 0.00
3 f1_may_1 f1_apr_1 33.60 20 0 0.000 0.00
4 f2_apr_1 f1_apr_1 11.00 40 0 0.000 0.00
5 fact1_2 f1_apr_2 174.50 550 50 250.000 43625.00
6 f1_mar_2 f1_apr_2 20.00 40 0 0.000 0.00
7 f1_may_2 f1_apr_2 49.20 15 0 0.000 0.00
8 f2_apr_2 f1_apr_2 21.00 25 0 0.000 0.00
9 fact1_1 f1_mar_1 127.90 500 50 333.333 42633.33
10 f1_apr_1 f1_mar_1 33.60 20 0 20.000 672.00
11 f2_mar_1 f1_mar_1 10.00 40 0 40.000 400.00
12 fact1_2 f1_mar_2 217.90 400 40 400.000 87160.00
13 f1_apr_2 f1_mar_2 38.40 30 0 30.000 1152.00
14 f2_mar_2 f1_mar_2 20.00 25 0 25.000 500.00
15 fact1_1 f1_may_1 90.10 400 50 128.333 11562.83
16 f1_apr_1 f1_may_1 12.00 50 0 0.000 0.00
17 f2_may_1 f1_may_1 13.00 40 0 0.000 0.00
18 fact1_2 f1_may_2 113.30 350 40 350.000 39655.00
19 f1_apr_2 f1_may_2 18.00 40 0 0.000 0.00
20 f2_may_2 f1_may_2 13.00 25 0 0.000 0.00
21 f1_apr_1 f2_apr_1 11.00 99999999 0 13.333 146.67
22 fact2_1 f2_apr_1 62.40 480 35 480.000 29952.00
23 f2_mar_1 f2_apr_1 18.00 30 0 0.000 0.00
24 f2_may_1 f2_apr_1 30.00 15 0 0.000 0.00
25 f1_apr_2 f2_apr_2 23.00 99999999 0 0.000 0.00
26 fact2_2 f2_apr_2 196.70 680 35 577.500 113594.25
27 f2_mar_2 f2_apr_2 28.00 50 0 0.000 0.00
28 f2_may_2 f2_apr_2 64.80 15 0 0.000 0.00
29 f1_mar_1 f2_mar_1 11.00 99999999 0 0.000 0.00
30 fact2_1 f2_mar_1 88.00 450 35 290.000 25520.00
31 f2_apr_1 f2_mar_1 20.40 15 0 0.000 0.00
32 f1_mar_2 f2_mar_2 23.00 99999999 0 0.000 0.00
33 fact2_2 f2_mar_2 182.00 650 35 650.000 118300.00
34 f2_apr_2 f2_mar_2 37.20 15 0 0.000 0.00
35 f1_may_1 f2_may_1 16.00 99999999 0 115.000 1840.00
36 fact2_1 f2_may_1 128.80 250 35 35.000 4508.00
37 f2_apr_1 f2_may_1 20.00 30 0 0.000 0.00
38 f1_may_2 f2_may_2 26.00 99999999 0 350.000 9100.00
39 fact2_2 f2_may_2 181.40 550 35 122.500 22221.50
40 f2_apr_2 f2_may_2 38.00 50 0 0.000 0.00
41 f1_mar_1 shop1_1 -327.65 250 0 143.333 -46963.16
42 f1_apr_1 shop1_1 -300.00 250 0 250.000 -75000.00
43 f1_may_1 shop1_1 -285.00 250 0 13.333 -3800.00
44 f2_mar_1 shop1_1 -297.40 250 0 250.000 -74350.00
45 f2_apr_1 shop1_1 -290.00 250 0 243.333 -70566.67
Output 4.3.1 continued
Production Planning/Inventory/Distribution
Adding Side Constraints
Obs _tail_ _head_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_
46 f2_may_1 shop1_1 -292.00 250 0 0.000 0.00
47 f1_mar_2 shop1_2 -559.76 99999999 0 0.000 0.00
48 f1_apr_2 shop1_2 -524.28 99999999 0 0.000 0.00
49 f1_may_2 shop1_2 -475.02 99999999 0 0.000 0.00
50 f2_mar_2 shop1_2 -567.83 500 0 500.000 -283915.00
51 f2_apr_2 shop1_2 -542.19 500 0 400.000 -216876.00
52 f2_may_2 shop1_2 -491.56 500 0 0.000 0.00
53 f1_mar_1 shop2_1 -362.74 250 0 250.000 -90685.00
54 f1_apr_1 shop2_1 -300.00 250 0 250.000 -75000.00
55 f1_may_1 shop2_1 -245.00 250 0 0.000 0.00
56 f2_mar_1 shop2_1 -272.70 250 0 0.000 0.00
57 f2_apr_1 shop2_1 -312.00 250 0 250.000 -78000.00
58 f2_may_1 shop2_1 -299.00 250 0 150.000 -44850.00
59 f1_mar_2 shop2_2 -623.89 99999999 0 455.000 -283869.95
60 f1_apr_2 shop2_2 -549.68 99999999 0 220.000 -120929.60
61 f1_may_2 shop2_2 -460.00 99999999 0 0.000 0.00
62 f2_mar_2 shop2_2 -542.83 500 0 125.000 -67853.75
63 f2_apr_2 shop2_2 -559.19 500 0 177.500 -99256.23
64 f2_may_2 shop2_2 -519.06 500 0 472.500 -245255.85
===========
-1282708.62
Example 4.4: Using Constraints and More Alteration to Arc Data
Suppose the 25-inch screen TVs produced at factory 1 in May can be sold at either shop with an
increased profit of 40 dollars each. What is the new optimal solution?
title2 'Using Constraints and Altering arc data';
data new_arc4;
set arc4;
oldcost=_cost_;
oldflow=_flow_;
oldfc=_fcost_;
if _tail_='f1_may_2' & (_head_='shop1_2' | _head_='shop2_2')
then _cost_=_cost_-40;
run;
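The DATA step above simply subtracts 40 from the _cost_ of the two f1_may_2 sales arcs. The arithmetic can be sketched outside SAS in plain Python; this is an illustration only, with the old costs copied from Output 4.3.1:

```python
# Reproduce, illustratively, the cost change made by the preceding DATA
# step: the two f1_may_2 sales arcs become 40 dollars more profitable
# (a sales arc's cost is the negative of its revenue).
old_costs = {
    ("f1_may_2", "shop1_2"): -475.02,  # from Output 4.3.1
    ("f1_may_2", "shop2_2"): -460.00,  # from Output 4.3.1
}
new_costs = {arc: round(cost - 40, 2) for arc, cost in old_costs.items()}
# new_costs now holds -515.02 and -500.00, the costs these arcs carry
# in the subsequent CONOUT=ARC5 output.
```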
proc intpoint
bytes=1000000
printlevel2=2
arcdata=new_arc4 nodedata=node0
condata=con3 sparsecondata rhsobs='CHIP/BO LIMIT'
conout=arc5;
run;
title2 'Using Constraints and Altering arc data';
proc print data=arc5;
var _tail_ _head_ _cost_ _capac_ _lo_
_supply_ _demand_ _name_ _flow_ _fcost_ oldflow oldfc;
/* to get this variable order */
sum oldfc _fcost_;
run;
The following messages appear on the SAS log:
NOTE: Number of nodes= 20 .
NOTE: Number of supply nodes= 4 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 4350 , total demand= 4150 .
NOTE: Number of arcs= 64 .
NOTE: Number of <= side constraints= 5 .
NOTE: Number of == side constraints= 0 .
NOTE: Number of >= side constraints= 0 .
NOTE: Number of side constraint coefficients= 16 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 5 .
NOTE: Number of == constraints= 21 .
NOTE: Number of >= constraints= 0 .
NOTE: Number of constraint coefficients= 152 .
NOTE: Number of variables= 68 .
NOTE: After preprocessing, number of <= constraints= 5.
NOTE: After preprocessing, number of == constraints= 20.
NOTE: After preprocessing, number of >= constraints= 0.
NOTE: The preprocessor eliminated 1 constraints from the problem.
NOTE: The preprocessor eliminated 9 constraint coefficients from the problem.
NOTE: 5 columns, 0 rows and 5 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 74 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 25 factor nodes make up 17 supernodes
NOTE: There are 88 nonzero sub-rows or sub-columns outside the supernodal triangular
regions along the factor's leading diagonal.
Iter Complem_aff Complem-ity Duality_gap Tot_infeasb Tot_infeasc Tot_infeasd
0 -1.000000 201073760 0.894528 65408 35351 10995
1 39022799 25967436 0.919693 4741.966761 2562.885742 256.192394
2 5186078 1844990 0.589523 0 0 6.174556
3 371920 320310 0.197224 0 0 1.074616
4 151369 87643 0.060906 0 0 0.267952
5 35115 25158 0.018017 0 0 0.072961
6 14667 6194.354873 0.004475 0 0 0.005048
7 2723.955063 2472.352937 0.001789 0 0 0.001714
8 1028.390365 280.346187 0.000203 0 0 0.000235
9 39.957867 5.611483 0.000004063 0 0 0
10 0.014117 0.000291 9.492733E-11 0 0 0
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 10
iterations.
NOTE: Optimum reached.
NOTE: Objective= -1295661.8.
NOTE: The data set WORK.ARC5 has 64 observations and 17 variables.
NOTE: There were 64 observations read from the data set WORK.NEW_ARC4.
NOTE: There were 8 observations read from the data set WORK.NODE0.
NOTE: There were 21 observations read from the data set WORK.CON3.
Output 4.4.1 CONOUT=ARC5
Production Planning/Inventory/Distribution
Using Constraints and Altering arc data
Obs _tail_ _head_ _cost_ _capac_ _lo_ _SUPPLY_ _DEMAND_
1 fact1_1 f1_apr_1 78.6 600 50 1000 .
2 f1_mar_1 f1_apr_1 15.0 50 0 . .
3 f1_may_1 f1_apr_1 33.6 20 0 . .
4 f2_apr_1 f1_apr_1 11.0 40 0 . .
5 fact1_2 f1_apr_2 174.5 550 50 1000 .
6 f1_mar_2 f1_apr_2 20.0 40 0 . .
7 f1_may_2 f1_apr_2 49.2 15 0 . .
8 f2_apr_2 f1_apr_2 21.0 25 0 . .
9 fact1_1 f1_mar_1 127.9 500 50 1000 .
10 f1_apr_1 f1_mar_1 33.6 20 0 . .
11 f2_mar_1 f1_mar_1 10.0 40 0 . .
12 fact1_2 f1_mar_2 217.9 400 40 1000 .
13 f1_apr_2 f1_mar_2 38.4 30 0 . .
14 f2_mar_2 f1_mar_2 20.0 25 0 . .
15 fact1_1 f1_may_1 90.1 400 50 1000 .
16 f1_apr_1 f1_may_1 12.0 50 0 . .
17 f2_may_1 f1_may_1 13.0 40 0 . .
18 fact1_2 f1_may_2 113.3 350 40 1000 .
19 f1_apr_2 f1_may_2 18.0 40 0 . .
20 f2_may_2 f1_may_2 13.0 25 0 . .
21 f1_apr_1 f2_apr_1 11.0 99999999 0 . .
22 fact2_1 f2_apr_1 62.4 480 35 850 .
23 f2_mar_1 f2_apr_1 18.0 30 0 . .
24 f2_may_1 f2_apr_1 30.0 15 0 . .
25 f1_apr_2 f2_apr_2 23.0 99999999 0 . .
26 fact2_2 f2_apr_2 196.7 680 35 1500 .
27 f2_mar_2 f2_apr_2 28.0 50 0 . .
28 f2_may_2 f2_apr_2 64.8 15 0 . .
29 f1_mar_1 f2_mar_1 11.0 99999999 0 . .
30 fact2_1 f2_mar_1 88.0 450 35 850 .
31 f2_apr_1 f2_mar_1 20.4 15 0 . .
32 f1_mar_2 f2_mar_2 23.0 99999999 0 . .
33 fact2_2 f2_mar_2 182.0 650 35 1500 .
34 f2_apr_2 f2_mar_2 37.2 15 0 . .
35 f1_may_1 f2_may_1 16.0 99999999 0 . .
Output 4.4.1 continued
Production Planning/Inventory/Distribution
Using Constraints and Altering arc data
Obs _tail_ _head_ _cost_ _capac_ _lo_ _SUPPLY_ _DEMAND_
36 fact2_1 f2_may_1 128.80 250 35 850 .
37 f2_apr_1 f2_may_1 20.00 30 0 . .
38 f1_may_2 f2_may_2 26.00 99999999 0 . .
39 fact2_2 f2_may_2 181.40 550 35 1500 .
40 f2_apr_2 f2_may_2 38.00 50 0 . .
41 f1_mar_1 shop1_1 -327.65 250 0 . 900
42 f1_apr_1 shop1_1 -300.00 250 0 . 900
43 f1_may_1 shop1_1 -285.00 250 0 . 900
44 f2_mar_1 shop1_1 -297.40 250 0 . 900
45 f2_apr_1 shop1_1 -290.00 250 0 . 900
46 f2_may_1 shop1_1 -292.00 250 0 . 900
47 f1_mar_2 shop1_2 -559.76 99999999 0 . 900
48 f1_apr_2 shop1_2 -524.28 99999999 0 . 900
49 f1_may_2 shop1_2 -515.02 99999999 0 . 900
50 f2_mar_2 shop1_2 -567.83 500 0 . 900
51 f2_apr_2 shop1_2 -542.19 500 0 . 900
52 f2_may_2 shop1_2 -491.56 500 0 . 900
53 f1_mar_1 shop2_1 -362.74 250 0 . 900
54 f1_apr_1 shop2_1 -300.00 250 0 . 900
55 f1_may_1 shop2_1 -245.00 250 0 . 900
56 f2_mar_1 shop2_1 -272.70 250 0 . 900
57 f2_apr_1 shop2_1 -312.00 250 0 . 900
58 f2_may_1 shop2_1 -299.00 250 0 . 900
59 f1_mar_2 shop2_2 -623.89 99999999 0 . 1450
60 f1_apr_2 shop2_2 -549.68 99999999 0 . 1450
61 f1_may_2 shop2_2 -500.00 99999999 0 . 1450
62 f2_mar_2 shop2_2 -542.83 500 0 . 1450
63 f2_apr_2 shop2_2 -559.19 500 0 . 1450
64 f2_may_2 shop2_2 -519.06 500 0 . 1450
Production Planning/Inventory/Distribution
Using Constraints and Altering arc data
Obs _name_ _FLOW_ _FCOST_ oldflow oldfc
1 prod f1 19 apl 533.333 41920.00 533.333 41920.00
2 0.000 0.00 0.000 0.00
3 back f1 19 may 0.000 0.00 0.000 0.00
4 0.000 0.00 0.000 0.00
5 prod f1 25 apl 250.000 43625.00 250.000 43625.00
6 0.000 0.00 0.000 0.00
7 back f1 25 may 0.000 0.00 0.000 0.00
8 0.000 0.00 0.000 0.00
9 prod f1 19 mar 333.333 42633.33 333.333 42633.33
10 back f1 19 apl 20.000 672.00 20.000 672.00
11 40.000 400.00 40.000 400.00
12 prod f1 25 mar 400.000 87160.00 400.000 87160.00
13 back f1 25 apl 30.000 1152.00 30.000 1152.00
14 25.000 500.00 25.000 500.00
15 128.333 11562.83 128.333 11562.83
16 0.000 0.00 0.000 0.00
17 0.000 0.00 0.000 0.00
18 350.000 39655.00 350.000 39655.00
19 0.000 0.00 0.000 0.00
20 0.000 0.00 0.000 0.00
21 13.333 146.67 13.333 146.67
22 prod f2 19 apl 480.000 29952.00 480.000 29952.00
23 0.000 0.00 0.000 0.00
24 back f2 19 may 0.000 0.00 0.000 0.00
25 0.000 0.00 0.000 0.00
26 prod f2 25 apl 550.000 108185.00 577.500 113594.25
27 0.000 0.00 0.000 0.00
28 back f2 25 may 0.000 0.00 0.000 0.00
29 0.000 0.00 0.000 0.00
30 prod f2 19 mar 290.000 25520.00 290.000 25520.00
31 back f2 19 apl 0.000 0.00 0.000 0.00
32 0.000 0.00 0.000 0.00
33 prod f2 25 mar 650.000 118300.00 650.000 118300.00
34 back f2 25 apl 0.000 0.00 0.000 0.00
35 115.000 1840.00 115.000 1840.00
36 35.000 4508.00 35.000 4508.00
37 0.000 0.00 0.000 0.00
38 0.000 0.00 350.000 9100.00
39 150.000 27210.00 122.500 22221.50
40 0.000 0.00 0.000 0.00
41 143.333 -46963.17 143.333 -46963.16
42 250.000 -75000.00 250.000 -75000.00
43 13.333 -3800.00 13.333 -3800.00
44 250.000 -74350.00 250.000 -74350.00
45 243.333 -70566.67 243.333 -70566.67
Production Planning/Inventory/Distribution
Using Constraints and Altering arc data
Obs _name_ _FLOW_ _FCOST_ oldflow oldfc
46 0.000 0.00 0.000 0.00
47 0.000 0.00 0.000 0.00
48 0.000 0.00 0.000 0.00
49 350.000 -180257.00 0.000 0.00
50 500.000 -283915.00 500.000 -283915.00
51 50.000 -27109.50 400.000 -216876.00
52 0.000 0.00 0.000 0.00
53 250.000 -90685.00 250.000 -90685.00
54 250.000 -75000.00 250.000 -75000.00
55 0.000 0.00 0.000 0.00
56 0.000 0.00 0.000 0.00
57 250.000 -78000.00 250.000 -78000.00
58 150.000 -44850.00 150.000 -44850.00
59 455.000 -283869.95 455.000 -283869.95
60 220.000 -120929.60 220.000 -120929.60
61 0.000 0.00 0.000 0.00
62 125.000 -67853.75 125.000 -67853.75
63 500.000 -279595.00 177.500 -99256.23
64 150.000 -77859.00 472.500 -245255.85
=========== ===========
-1295661.80 -1282708.62
Example 4.5: Nonarc Variables in the Side Constraints
You can verify that the FACT2 MAR GIZMO constraint has a left-hand-side activity of 3,470, which is
not equal to the _RHS_ of this constraint. Not all of the 3,750 chips that can be supplied to factory
2 for March production are used. It is suggested that all the possible chips be obtained in March
and those not used be saved for April production. Because chips must be kept in an air-controlled
environment, it costs one dollar to store each chip purchased in March until April. The maximum
number of chips that can be stored in this environment at each factory is 150. In addition, a search of
the parts inventory at factory 1 turned up 15 chips available for their March production.
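The 3,470 figure can be recomputed by hand from the Example 4.4 solution: each 19-inch TV made at factory 2 in March uses 3 chips and each 25-inch TV uses 4. A small illustrative check in plain Python (not part of the SAS run; the flows are the optimal March production levels in Output 4.4.1):

```python
# Illustrative check of the FACT2 MAR GIZMO left-hand-side activity,
# using the optimal flows from Output 4.4.1.
flow_19_inch = 290.0   # prod f2 19 mar: fact2_1 -> f2_mar_1
flow_25_inch = 650.0   # prod f2 25 mar: fact2_2 -> f2_mar_2
activity = 3 * flow_19_inch + 4 * flow_25_inch   # 3 or 4 chips per TV
# activity is 3470.0, strictly below the 3750-chip right-hand side.
```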
Nonarc variables are used in the side constraints that handle the limitations of supply of Gizmo
chips. A nonarc variable called f1 unused chips has as a value the number of chips that are not used
at factory 1 in March. Another nonarc variable, f2 unused chips, has as a value the number of chips
that are not used at factory 2 in March. f1 chips from mar has as a value the number of chips left over
from March used for production at factory 1 in April. Similarly, f2 chips from mar has as a value the
number of chips left over from March used for April production at factory 2. The last two
nonarc variables have objective function coefficients of 1 and upper bounds of 150. The Gizmo side
constraints are
3 * prod f1 19 mar + 4 * prod f1 25 mar + f1 unused chips = 2615
3 * prod f2 19 mar + 4 * prod f2 25 mar + f2 unused chips = 3750
3 * prod f1 19 apl + 4 * prod f1 25 apl - f1 chips from mar = 2600
3 * prod f2 19 apl + 4 * prod f2 25 apl - f2 chips from mar = 3750
f1 unused chips + f2 unused chips - f1 chips from mar - f2 chips from mar >= 0
The last side constraint states that the number of chips not used in March is not less than the number
of chips left over from March and used in April. Here, this constraint is called CHIP LEFTOVER.
The following SAS code creates a new data set containing the constraint data. Most of the
constraints are now equalities, so you specify DEFCONTYPE=EQ in the PROC INTPOINT
statement from now on and provide constraint type data only for the constraints that are not of the EQ type,
using the default TYPEOBS value _TYPE_ as the _COLUMN_ variable value to indicate observations
that contain constraint type data. Also, from now on, the default RHSOBS value is used.
title2 'Nonarc Variables in the Side Constraints';
data con6;
input _column_ &$17. _row_ &$15. _coef_ ;
datalines;
prod f1 19 mar FACT1 MAR GIZMO 3
prod f1 25 mar FACT1 MAR GIZMO 4
f1 unused chips FACT1 MAR GIZMO 1
_RHS_ FACT1 MAR GIZMO 2615
prod f2 19 mar FACT2 MAR GIZMO 3
prod f2 25 mar FACT2 MAR GIZMO 4
f2 unused chips FACT2 MAR GIZMO 1
_RHS_ FACT2 MAR GIZMO 3750
prod f1 19 apl FACT1 APL GIZMO 3
prod f1 25 apl FACT1 APL GIZMO 4
f1 chips from mar FACT1 APL GIZMO -1
_RHS_ FACT1 APL GIZMO 2600
prod f2 19 apl FACT2 APL GIZMO 3
prod f2 25 apl FACT2 APL GIZMO 4
f2 chips from mar FACT2 APL GIZMO -1
_RHS_ FACT2 APL GIZMO 3750
f1 unused chips CHIP LEFTOVER 1
f2 unused chips CHIP LEFTOVER 1
f1 chips from mar CHIP LEFTOVER -1
f2 chips from mar CHIP LEFTOVER -1
_TYPE_ CHIP LEFTOVER 1
back f1 19 apl TOTAL BACKORDER 1
back f1 25 apl TOTAL BACKORDER 1
back f2 19 apl TOTAL BACKORDER 1
back f2 25 apl TOTAL BACKORDER 1
back f1 19 may TOTAL BACKORDER 1
back f1 25 may TOTAL BACKORDER 1
back f2 19 may TOTAL BACKORDER 1
back f2 25 may TOTAL BACKORDER 1
_TYPE_ TOTAL BACKORDER -1
_RHS_ TOTAL BACKORDER 50
;
The nonarc variables f1 chips from mar and f2 chips from mar have objective function coefficients of
1 and upper bounds of 150. There are various ways in which this information can be furnished to
PROC INTPOINT. If there were a TYPE list variable in the CONDATA= data set, observations could
be in the form
_COLUMN_ _TYPE_ _ROW_ _COEF_
f1 chips from mar objfn . 1
f1 chips from mar upperbd . 150
f2 chips from mar objfn . 1
f2 chips from mar upperbd . 150
It is desirable to assign ID list variable values to all the nonarc variables:
data arc6;
input _tail_ $ _head_ $ _cost_ _capac_ _lo_ diagonal factory
key_id $10. mth_made $ _name_ &$17.;
datalines;
fact1_1 f1_apr_1 78.60 600 50 19 1 production April prod f1 19 apl
f1_mar_1 f1_apr_1 15.00 50 . 19 1 storage March .
f1_may_1 f1_apr_1 33.60 20 . 19 1 backorder May back f1 19 may
f2_apr_1 f1_apr_1 11.00 40 . 19 . f2_to_1 April .
fact1_2 f1_apr_2 174.50 550 50 25 1 production April prod f1 25 apl
f1_mar_2 f1_apr_2 20.00 40 . 25 1 storage March .
f1_may_2 f1_apr_2 49.20 15 . 25 1 backorder May back f1 25 may
f2_apr_2 f1_apr_2 21.00 25 . 25 . f2_to_1 April .
fact1_1 f1_mar_1 127.90 500 50 19 1 production March prod f1 19 mar
f1_apr_1 f1_mar_1 33.60 20 . 19 1 backorder April back f1 19 apl
f2_mar_1 f1_mar_1 10.00 40 . 19 . f2_to_1 March .
fact1_2 f1_mar_2 217.90 400 40 25 1 production March prod f1 25 mar
f1_apr_2 f1_mar_2 38.40 30 . 25 1 backorder April back f1 25 apl
f2_mar_2 f1_mar_2 20.00 25 . 25 . f2_to_1 March .
fact1_1 f1_may_1 90.10 400 50 19 1 production May .
f1_apr_1 f1_may_1 12.00 50 . 19 1 storage April .
f2_may_1 f1_may_1 13.00 40 . 19 . f2_to_1 May .
fact1_2 f1_may_2 113.30 350 40 25 1 production May .
f1_apr_2 f1_may_2 18.00 40 . 25 1 storage April .
f2_may_2 f1_may_2 13.00 25 . 25 . f2_to_1 May .
f1_apr_1 f2_apr_1 11.00 . . 19 . f1_to_2 April .
fact2_1 f2_apr_1 62.40 480 35 19 2 production April prod f2 19 apl
f2_mar_1 f2_apr_1 18.00 30 . 19 2 storage March .
f2_may_1 f2_apr_1 30.00 15 . 19 2 backorder May back f2 19 may
f1_apr_2 f2_apr_2 23.00 . . 25 . f1_to_2 April .
fact2_2 f2_apr_2 196.70 680 35 25 2 production April prod f2 25 apl
f2_mar_2 f2_apr_2 28.00 50 . 25 2 storage March .
f2_may_2 f2_apr_2 64.80 15 . 25 2 backorder May back f2 25 may
f1_mar_1 f2_mar_1 11.00 . . 19 . f1_to_2 March .
fact2_1 f2_mar_1 88.00 450 35 19 2 production March prod f2 19 mar
f2_apr_1 f2_mar_1 20.40 15 . 19 2 backorder April back f2 19 apl
f1_mar_2 f2_mar_2 23.00 . . 25 . f1_to_2 March .
fact2_2 f2_mar_2 182.00 650 35 25 2 production March prod f2 25 mar
f2_apr_2 f2_mar_2 37.20 15 . 25 2 backorder April back f2 25 apl
f1_may_1 f2_may_1 16.00 . . 19 . f1_to_2 May .
fact2_1 f2_may_1 128.80 250 35 19 2 production May .
f2_apr_1 f2_may_1 20.00 30 . 19 2 storage April .
f1_may_2 f2_may_2 26.00 . . 25 . f1_to_2 May .
fact2_2 f2_may_2 181.40 550 35 25 2 production May .
f2_apr_2 f2_may_2 38.00 50 . 25 2 storage April .
f1_mar_1 shop1_1 -327.65 250 . 19 1 sales March .
f1_apr_1 shop1_1 -300.00 250 . 19 1 sales April .
f1_may_1 shop1_1 -285.00 250 . 19 1 sales May .
f2_mar_1 shop1_1 -297.40 250 . 19 2 sales March .
f2_apr_1 shop1_1 -290.00 250 . 19 2 sales April .
f2_may_1 shop1_1 -292.00 250 . 19 2 sales May .
f1_mar_2 shop1_2 -559.76 . . 25 1 sales March .
f1_apr_2 shop1_2 -524.28 . . 25 1 sales April .
f1_may_2 shop1_2 -515.02 . . 25 1 sales May .
f2_mar_2 shop1_2 -567.83 500 . 25 2 sales March .
f2_apr_2 shop1_2 -542.19 500 . 25 2 sales April .
f2_may_2 shop1_2 -491.56 500 . 25 2 sales May .
f1_mar_1 shop2_1 -362.74 250 . 19 1 sales March .
f1_apr_1 shop2_1 -300.00 250 . 19 1 sales April .
f1_may_1 shop2_1 -245.00 250 . 19 1 sales May .
f2_mar_1 shop2_1 -272.70 250 . 19 2 sales March .
f2_apr_1 shop2_1 -312.00 250 . 19 2 sales April .
f2_may_1 shop2_1 -299.00 250 . 19 2 sales May .
f1_mar_2 shop2_2 -623.89 . . 25 1 sales March .
f1_apr_2 shop2_2 -549.68 . . 25 1 sales April .
f1_may_2 shop2_2 -500.00 . . 25 1 sales May .
f2_mar_2 shop2_2 -542.83 500 . 25 2 sales March .
f2_apr_2 shop2_2 -559.19 500 . 25 2 sales April .
f2_may_2 shop2_2 -519.06 500 . 25 2 sales May .
;
Alternatively, arc6 can be created from the Example 4.4 solution, arc5, by dropping the variables
that are no longer needed:
data arc6;
set arc5;
drop oldcost oldfc oldflow _flow_ _fcost_ ;
run;
data arc6_b;
input _name_ &$17. _cost_ _capac_ factory key_id $ ;
datalines;
f1 unused chips . . 1 chips
f2 unused chips . . 2 chips
f1 chips from mar 1 150 1 chips
f2 chips from mar 1 150 2 chips
;
proc append force
base=arc6 data=arc6_b;
run;
proc intpoint
bytes=1000000
printlevel2=2
nodedata=node0 arcdata=arc6
condata=con6 defcontype=eq sparsecondata
conout=arc7;
run;
The following messages appear on the SAS log:
NOTE: Number of nodes= 20 .
NOTE: Number of supply nodes= 4 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 4350 , total demand= 4150 .
NOTE: Number of arcs= 64 .
NOTE: Number of nonarc variables= 4 .
NOTE: Number of <= side constraints= 1 .
NOTE: Number of == side constraints= 4 .
NOTE: Number of >= side constraints= 1 .
NOTE: Number of side constraint coefficients= 24 .
NOTE: The following messages relate to the equivalent Linear Programming problem
solved by the Interior Point algorithm.
NOTE: Number of <= constraints= 1 .
NOTE: Number of == constraints= 25 .
NOTE: Number of >= constraints= 1 .
NOTE: Number of constraint coefficients= 160 .
NOTE: Number of variables= 72 .
NOTE: After preprocessing, number of <= constraints= 1.
NOTE: After preprocessing, number of == constraints= 24.
NOTE: After preprocessing, number of >= constraints= 1.
NOTE: The preprocessor eliminated 1 constraints from the problem.
NOTE: The preprocessor eliminated 9 constraint coefficients from the problem.
NOTE: 2 columns, 0 rows and 2 coefficients were added to the problem to handle
unrestricted variables, variables that are split, and constraint slack or
surplus variables.
NOTE: There are 78 sub-diagonal nonzeroes in the unfactored A Atranspose matrix.
NOTE: The 26 factor nodes make up 18 supernodes
NOTE: There are 101 nonzero sub-rows or sub-columns outside the supernodal triangular
regions along the factor's leading diagonal.
Iter Complem_aff Complem-ity Duality_gap Tot_infeasb Tot_infeasc Tot_infeasd
0 -1.000000 210688061 0.904882 69336 35199 4398.024971
1 54066756 35459986 0.931873 5967.706945 3029.541352 935.225890
2 10266927 2957978 0.671565 0 0 36.655485
3 326659 314818 0.177750 0 0 3.893178
4 137432 83570 0.053111 0 0 0.852994
5 41386 26985 0.017545 0 0 0.204166
6 12451 6063.528974 0.003973 0 0 0.041229
7 2962.309960 1429.369437 0.000939 0 0 0.004395
8 352.469864 233.620884 0.000153 0 0 0.000297
9 115.012309 23.329492 0.000015331 0 0 0
10 1.754859 0.039304 2.5828261E-8 0 0 0
NOTE: The Primal-Dual Predictor-Corrector Interior Point algorithm performed 10
iterations.
NOTE: Optimum reached.
NOTE: Objective= -1295542.717.
NOTE: The data set WORK.ARC7 has 68 observations and 14 variables.
NOTE: There were 68 observations read from the data set WORK.ARC6.
NOTE: There were 8 observations read from the data set WORK.NODE0.
NOTE: There were 31 observations read from the data set WORK.CON6.
The optimal solution data set, CONOUT=ARC7, is given in Output 4.5.1.
proc print data=arc7;
var _tail_ _head_ _name_ _cost_ _capac_ _lo_
_flow_ _fcost_;
sum _fcost_;
run;
The optimal value of the nonarc variable f2 unused chips is 280. This means that although there are
3,750 chips that can be used at factory 2 in March, only 3,470 are used. As the optimal value of f1
unused chips is zero, all chips available for production in March at factory 1 are used. The nonarc
variable f2 chips from mar also has zero optimal value. This means that the April production at factory
2 does not need any chips that could have been held in inventory since March. However, the nonarc
variable f1 chips from mar has a value of 20. Thus, 3,490 chips should be ordered for factory 2 in March.
Twenty of these chips should be held in inventory until April, then sent to factory 1.
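The chip bookkeeping in the paragraph above can be restated as a small illustrative calculation in plain Python (not part of the SAS run; the values are the optimal nonarc variable values from Output 4.5.1):

```python
# Chip bookkeeping for factory 2 in March, using the optimal nonarc
# variable values reported in Output 4.5.1.
march_chip_limit  = 3750     # chips that can be supplied to factory 2
f2_unused_chips   = 280.0    # optimal value of "f2 unused chips"
f1_chips_from_mar = 20.0     # chips stored until April, then sent to factory 1
chips_used_in_march = march_chip_limit - f2_unused_chips       # 3470.0
chips_to_order      = chips_used_in_march + f1_chips_from_mar  # 3490.0
```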
Output 4.5.1 CONOUT=ARC7
Production Planning/Inventory/Distribution
Nonarc Variables in the Side Constraints
Obs _tail_ _head_ _name_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_
1 fact1_1 f1_apr_1 prod f1 19 apl 78.60 600 50 540.000 42444.00
2 f1_mar_1 f1_apr_1 15.00 50 0 0.000 0.00
3 f1_may_1 f1_apr_1 back f1 19 may 33.60 20 0 0.000 0.00
4 f2_apr_1 f1_apr_1 11.00 40 0 0.000 0.00
5 fact1_2 f1_apr_2 prod f1 25 apl 174.50 550 50 250.000 43625.01
6 f1_mar_2 f1_apr_2 20.00 40 0 0.000 0.00
7 f1_may_2 f1_apr_2 back f1 25 may 49.20 15 0 0.000 0.00
8 f2_apr_2 f1_apr_2 21.00 25 0 25.000 525.00
9 fact1_1 f1_mar_1 prod f1 19 mar 127.90 500 50 338.333 43272.81
10 f1_apr_1 f1_mar_1 back f1 19 apl 33.60 20 0 20.000 672.00
11 f2_mar_1 f1_mar_1 10.00 40 0 40.000 400.00
12 fact1_2 f1_mar_2 prod f1 25 mar 217.90 400 40 400.000 87159.99
13 f1_apr_2 f1_mar_2 back f1 25 apl 38.40 30 0 30.000 1152.00
14 f2_mar_2 f1_mar_2 20.00 25 0 25.000 500.00
15 fact1_1 f1_may_1 90.10 400 50 116.667 10511.68
16 f1_apr_1 f1_may_1 12.00 50 0 0.000 0.00
17 f2_may_1 f1_may_1 13.00 40 0 0.000 0.00
18 fact1_2 f1_may_2 113.30 350 40 350.000 39655.00
19 f1_apr_2 f1_may_2 18.00 40 0 0.000 0.00
20 f2_may_2 f1_may_2 13.00 25 0 0.000 0.00
21 f1_apr_1 f2_apr_1 11.00 99999999 0 20.000 220.00
22 fact2_1 f2_apr_1 prod f2 19 apl 62.40 480 35 480.000 29952.00
23 f2_mar_1 f2_apr_1 18.00 30 0 0.000 0.00
24 f2_may_1 f2_apr_1 back f2 19 may 30.00 15 0 0.000 0.00
25 f1_apr_2 f2_apr_2 23.00 99999999 0 0.000 0.00
26 fact2_2 f2_apr_2 prod f2 25 apl 196.70 680 35 577.500 113594.25
27 f2_mar_2 f2_apr_2 28.00 50 0 0.000 0.00
28 f2_may_2 f2_apr_2 back f2 25 may 64.80 15 0 0.000 0.00
29 f1_mar_1 f2_mar_1 11.00 99999999 0 0.000 0.00
30 fact2_1 f2_mar_1 prod f2 19 mar 88.00 450 35 290.000 25520.00
Output 4.5.1 continued
Production Planning/Inventory/Distribution
Nonarc Variables in the Side Constraints
Obs _tail_ _head_ _name_ _cost_ _capac_ _lo_ _FLOW_ _FCOST_
31 f2_apr_1 f2_mar_1 back f2 19 apl 20.40 15 0 0.000 0.00
32 f1_mar_2 f2_mar_2 23.00 99999999 0 0.000 0.00
33 fact2_2 f2_mar_2 prod f2 25 mar 182.00 650 35 650.000 118300.00
34 f2_apr_2 f2_mar_2 back f2 25 apl 37.20 15 0 0.000 0.00
35 f1_may_1 f2_may_1 16.00 99999999 0 115.000 1840.00
36 fact2_1 f2_may_1 128.80 250 35 35.000 4508.00
37 f2_apr_1 f2_may_1 20.00 30 0 0.000 0.00
38 f1_may_2 f2_may_2 26.00 99999999 0 0.000 0.00
39 fact2_2 f2_may_2 181.40 550 35 122.500 22221.50
40 f2_apr_2 f2_may_2 38.00 50 0 0.000 0.00
41 f1_mar_1 shop1_1 -327.65 250 0 148.333 -48601.35
42 f1_apr_1 shop1_1 -300.00 250 0 250.000 -75000.00
43 f1_may_1 shop1_1 -285.00 250 0 1.667 -475.01
44 f2_mar_1 shop1_1 -297.40 250 0 250.000 -74350.00
45 f2_apr_1 shop1_1 -290.00 250 0 250.000 -72500.00
46 f2_may_1 shop1_1 -292.00 250 0 0.000 -0.05
47 f1_mar_2 shop1_2 -559.76 99999999 0 0.000 0.00
48 f1_apr_2 shop1_2 -524.28 99999999 0 0.000 0.00
49 f1_may_2 shop1_2 -515.02 99999999 0 347.500 -178969.34
50 f2_mar_2 shop1_2 -567.83 500 0 500.000 -283914.98
51 f2_apr_2 shop1_2 -542.19 500 0 52.500 -28465.09
52 f2_may_2 shop1_2 -491.56 500 0 0.000 0.00
53 f1_mar_1 shop2_1 -362.74 250 0 250.000 -90684.99
54 f1_apr_1 shop2_1 -300.00 250 0 250.000 -75000.00
55 f1_may_1 shop2_1 -245.00 250 0 0.000 -0.00
56 f2_mar_1 shop2_1 -272.70 250 0 0.000 -0.01
57 f2_apr_1 shop2_1 -312.00 250 0 250.000 -78000.00
58 f2_may_1 shop2_1 -299.00 250 0 150.000 -44850.00
59 f1_mar_2 shop2_2 -623.89 99999999 0 455.000 -283869.90
60 f1_apr_2 shop2_2 -549.68 99999999 0 245.000 -134671.54
61 f1_may_2 shop2_2 -500.00 99999999 0 2.500 -1250.00
62 f2_mar_2 shop2_2 -542.83 500 0 125.000 -67853.77
63 f2_apr_2 shop2_2 -559.19 500 0 500.000 -279594.99
64 f2_may_2 shop2_2 -519.06 500 0 122.500 -63584.94
65 f1 chips from mar 1.00 150 0 20.000 20.00
66 f1 unused chips 0.00 99999999 0 0.001 0.00
67 f2 chips from mar 1.00 150 0 0.000 0.00
68 f2 unused chips 0.00 99999999 0 280.000 0.00
===========
-1295542.72
Example 4.6: Solving an LP Problem with Data in MPS Format
In this example, PROC INTPOINT is ultimately used to solve an LP. But prior to that, SAS code is
used to read an MPS-format file and initialize an input SAS data set. MPS was an optimization
package developed for IBM computers many years ago, and the format in which data had to be supplied
to that system became the industry standard for other optimization software packages, including
those developed recently. The MPS format is described in Murtagh (1981). If you have an LP whose
data is in MPS format in a file /your-directory/your-filename.dat, then the following SAS code
should be run:
filename w '/your-directory/your-filename.dat';
data raw;
infile w lrecl=80 pad;
input field1 $ 2-3 field2 $ 5-12 field3 $ 15-22
field4 25-36 field5 $ 40-47 field6 50-61;
run;
%sasmpsxs;
data lp;
set;
if _type_="FREE" then _type_="MIN";
if lag(_type_)="*HS" then _type_="RHS";
run;
proc sort data=lp;
by _col_;
run;
proc intpoint
arcdata=lp
condata=lp sparsecondata rhsobs=rhs grouped=condata
conout=solutn  /* SAS data set for the optimal solution */
bytes=20000000
nnas=1700 ncoefs=4000 ncons=700
printlevel2=2 memrep;
run;
proc lp
data=lp sparsedata
endpause time=3600 maxit1=100000 maxit2=100000;
run;
show status;
quit;
You will have to specify the appropriate path and file name in which your MPS-format data resides.
SASMPSXS is a SAS macro provided within SAS/OR software. The MPS format resembles the
sparse format of the CONDATA= data set for PROC INTPOINT. The SAS macro SASMPSXS
examines the MPS data and transfers it into a SAS data set while automatically taking into account
how the MPS format differs slightly from PROC INTPOINT's sparse format.
The parameters NNAS=1700, NCOEFS=4000, and NCONS=700 indicate the approximate (overesti-
mated) number of variables, coefficients, and constraints this model has. You must change these to
your problem's dimensions. Knowing these, PROC INTPOINT is able to utilize memory better and
read the data faster. These parameters are optional.
The PROC SORT preceding PROC INTPOINT is not necessary, but sorting the SAS data set can
speed up PROC INTPOINT when it reads the data. After the sort, data for each column are grouped
together, and GROUPED=condata can be specified.
For small problems, presorting and specifying those additional options will not greatly
influence PROC INTPOINT's run time. However, when problems are large, presorting and specifying
those additional options can be very worthwhile.
If you generate the model yourself, you will be familiar enough with it to know what to specify
for the RHSOBS= parameter. If the value of the SAS variable in the COLUMN list is equal to
the character string specified as the RHSOBS= option, the data in that observation are interpreted
as right-hand-side data as opposed to coefficient data. If you do not know what to specify for the
RHSOBS= option, you should first run PROC LP and optionally set MAXIT1=1 and MAXIT2=1.
PROC LP will output a Problem Summary that includes the line
Rhs Variable rhs-charstr
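In practice, that preliminary run can be sketched as follows; the data set name lp carries over from the conversion step above, and MAXIT1=1 and MAXIT2=1 stop the optimization after one pivot in each phase so that the run finishes quickly:

proc lp data=lp sparsedata maxit1=1 maxit2=1;
run;
quit;

The character string reported on the Rhs Variable line of the Problem Summary is the value to supply to the RHSOBS= option of PROC INTPOINT.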
BYTES=20000000 is the size of working memory PROC INTPOINT is allowed.
The options PRINTLEVEL2=2 and MEMREP indicate that you want to see an iteration log and
messages about memory usage. Specifying these options is optional.
Example 4.7: Converting to an MPS-Format SAS Data Set
This example demonstrates the use of the MPSOUT= option to convert a problem data set in PROC
INTPOINT input format into an MPS-format SAS data set for use with the OPTLP procedure.
Suppose you want to solve a linear program with the following formulation:
min  2x1 − 3x2 − 4x3
subject to  −2x2 − 3x3 ≥ −5
            x1 + x2 + 2x3 ≤ 6
            x1 + 2x2 + 3x3 ≥ 7
            0 ≤ x1 ≤ 10
            0 ≤ x2 ≤ 15
            0 ≤ x3 ≤ 20
You can save the LP in dense format by using the following DATA step:
data exdata;
input x1 x2 x3 _type_ $ _rhs_;
datalines;
2 -3 -4 min .
. -2 -3 >= -5
1 1 2 <= 6
1 2 3 >= 7
10 15 20 upperbd .
;
If you decide to solve the problem by using the OPTLP procedure, you need to convert the data
set exdata from dense format to MPS format. You can accomplish this by using the following
statements:
proc intpoint condata=exdata mpsout=mpsdata bytes=100000;
run;
The MPS-format SAS data set mpsdata is shown in Output 4.7.1.
Output 4.7.1 Data Set mpsdata
Obs field1 field2 field3 field4 field5 field6
1 NAME modname . .
2 ROWS . .
3 MIN objfn . .
4 G _OBS2_ . .
5 L _OBS3_ . .
6 G _OBS4_ . .
7 COLUMNS . .
8 x1 objfn 2 _OBS3_ 1
9 x1 _OBS4_ 1 .
10 x2 objfn -3 _OBS2_ -2
11 x2 _OBS3_ 1 _OBS4_ 2
12 x3 objfn -4 _OBS2_ -3
13 x3 _OBS3_ 2 _OBS4_ 3
14 RHS . .
15 _OBS2_ -5 _OBS3_ 6
16 _OBS4_ 7 .
17 BOUNDS . .
18 UP bdsvect x1 10 .
19 UP bdsvect x2 15 .
20 UP bdsvect x3 20 .
21 ENDATA . .
The constraint names _OBS2_, _OBS3_, and _OBS4_ are generated by the INTPOINT procedure. If
you want to provide your own constraint names, use the ROW list variable in the CONOUT= data
set. If you specify the problem data in sparse format instead of dense format, the MPSOUT= option
produces the same MPS-format SAS data set shown in the preceding output.
Now that the problem data is in MPS format, you can solve the problem by using the OPTLP
procedure. For more information, see Chapter 17, "The OPTLP Procedure."
References
George, A., Liu, J., and Ng, E. (2001), Computer Solution of Positive Definite Systems, unpublished
book obtainable from the authors.
Lustig, I. J., Marsten, R. E., and Shanno, D. F. (1992), "On Implementing Mehrotra's Predictor-
Corrector Interior-Point Method for Linear Programming," SIAM Journal on Optimization, 2,
435–449.
Murtagh, B. A. (1981), Advanced Linear Programming, Computation and Practice, New York:
McGraw-Hill.
Reid, J. K. (1975), "A Sparsity-Exploiting Variant of the Bartels-Golub Decomposition for Linear
Programming Bases," Harwell Report CSS 20.
Roos, C., Terlaky, T., and Vial, J. (1997), Theory and Algorithms for Linear Optimization, Chichester,
England: John Wiley & Sons.
Wright, S. J. (1996), Primal-Dual Interior Point Algorithms, Philadelphia: SIAM.
Ye, Y. (1996), Interior Point Algorithms: Theory and Analysis, New York: John Wiley & Sons.
Chapter 5
The LP Procedure
Contents
Overview: LP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Getting Started: LP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
An Introductory Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
An Integer Programming Example . . . . . . . . . . . . . . . . . . . . . . 180
An MPS Format to Sparse Format Conversion Example . . . . . . . . . . . 182
Syntax: LP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Functional Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
PROC LP Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
COEF Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
COL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
ID Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
IPIVOT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
PIVOT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
PRINT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
QUIT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
RANGE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
RESET Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
RHS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
RHSSEN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
ROW Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
RUN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
SHOW Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
TYPE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
VAR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Details: LP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Dense Data Input Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Sparse Data Input Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Converting Any PROC LP Format to an MPS-Format SAS Data Set . . . . . 210
Converting Standard MPS Format to Sparse Format . . . . . . . . . . . . . 210
The Reduced Costs, Dual Activities, and Current Tableau . . . . . . . . . . 213
Macro Variable _ORLP_ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Integer Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Range Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Parametric Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Interactive Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Output Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Input Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Displayed Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
ODS Table and Variable Names . . . . . . . . . . . . . . . . . . . . . . . . 239
Memory Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Examples: LP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Example 5.1: An Oil Blending Problem . . . . . . . . . . . . . . . . . . . 242
Example 5.2: A Sparse View of the Oil Blending Problem . . . . . . . . . . . 247
Example 5.3: Sensitivity Analysis: Changes in Objective Coefficients . . . . 250
Example 5.4: Additional Sensitivity Analysis . . . . . . . . . . . . . . . . 253
Example 5.5: Price Parametric Programming for the Oil Blending Problem . . 257
Example 5.6: Special Ordered Sets and the Oil Blending Problem . . . . . . . 261
Example 5.7: Goal-Programming a Product Mix Problem . . . . . . . . . . 263
Example 5.8: A Simple Integer Program . . . . . . . . . . . . . . . . . . . 270
Example 5.9: An Infeasible Problem . . . . . . . . . . . . . . . . . . . . . 273
Example 5.10: Restarting an Integer Program . . . . . . . . . . . . . . . . 275
Example 5.11: Alternative Search of the Branch-and-Bound Tree . . . . . . . 281
Example 5.12: An Assignment Problem . . . . . . . . . . . . . . . . . . . 286
Example 5.13: A Scheduling Problem . . . . . . . . . . . . . . . . . . . . 292
Example 5.14: A Multicommodity Transshipment Problem with Fixed Charges 299
Example 5.15: Converting to an MPS-Format SAS Data Set . . . . . . . . . 302
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Overview: LP Procedure
The LP procedure solves linear programs, integer programs, and mixed-integer programs. It also
performs parametric programming and range analysis, and it reports on solution sensitivity to changes
in the right-hand-side constants and price coefficients.
The LP procedure provides various control options and solution strategies. It also provides the
functionality to produce various kinds of intermediate and final solution information. The procedure's
interactive features enable you to take control of the problem-solving process. During linear or
integer iterations, for example, you can stop the procedure at intermediate stages and examine current
results. If necessary, you can change options or strategies and resume the execution of the procedure.
The LP procedure is used to optimize a linear function subject to linear and integer constraints.
Specifically, the LP procedure solves the general mixed-integer program of the form
minimize    cᵀx
subject to  Ax {≥, =, ≤} b
            l ≤ x ≤ u
            xᵢ integer, i ∈ S
where
• A is an m × n matrix of technological coefficients
• b is an m × 1 matrix of right-hand-side (RHS) constants
• c is an n × 1 matrix of objective function coefficients
• x is an n × 1 matrix of structural variables
• l is an n × 1 matrix of lower bounds on x
• u is an n × 1 matrix of upper bounds on x
• S is a subset of the set of indices {1, ..., n}
Linear programs (when S is empty) are denoted by (LP). For these problems, the procedure employs
the two-phase revised simplex method, which uses the Bartels-Golub update of the LU decomposed
basis matrix to pivot between feasible solutions (Bartels 1971). In phase 1, PROC LP finds a basic
feasible solution to (LP), while in phase 2, PROC LP finds an optimal solution, x^opt. The procedure
implicitly handles unrestricted variables, lower-bounded variables, upper-bounded variables, and
ranges on constraints. When no explicit lower bounds are specified, PROC LP assumes that all
variables are bounded below by zero.
When a variable is specied as an integer variable, S has at least one element. The procedure then
uses the branch-and-bound technique for optimization.
The relaxed problem (the problem with no integer constraints) is solved initially using the primal
algorithm described previously. Constraints are added in defining the subsequent descendant problems
in the branch-and-bound tree. These problems are then solved using the dual simplex algorithm.
Dual pivots are referred to as phase 3 pivots.
The preprocessing option enables the procedure to identify redundant and infeasible constraints,
fix variables, and reduce the feasible region before solving a problem. For linear programs, the
option often can reduce the number of constraints and variables, leading to a quicker elapsed solution
time and improved reliability. For integer programs, it often reduces the gap between an integer
program and its relaxed linear program, which will likely lead to a reduced branch-and-bound tree
and a quicker CPU time. In general, it provides users an alternative to solving large, complicated
operations research problems.
The LP procedure can also analyze the sensitivity of the solution x^opt to changes in both the objective
function and the right-hand-side constants. There are three techniques available for this analysis:
sensitivity analysis, parametric programming, and range analysis. Sensitivity analysis enables you to
examine the size of a perturbation to the right-hand-side or objective vector by an arbitrary change
vector for which the basis of the current optimal solution remains optimal.
Parametric programming, on the other hand, enables you to specify the size of the perturbation
beforehand and examine how the optimal solution changes as the desired perturbation is realized.
With this technique, the procedure pivots to maintain optimality as the right-hand-side or objective
vector is perturbed beyond the range for which the current solution is optimal. Range analysis is
used to examine the range of each right-hand-side value or objective coefficient for which the basis
of the current optimal solution remains optimal.
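As a sketch, both kinds of range analysis can be requested directly on the PROC LP statement by using the RANGEPRICE and RANGERHS options (the input data set is assumed to be the most recently created one):

proc lp rangeprice rangerhs;
run;
quit;

The price range and RHS range reports then list, for each objective coefficient and each right-hand-side value, the interval over which the current basis remains optimal.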
The LP procedure can also save both primal and dual solutions, the current tableau, and the branch-
and-bound tree in SAS data sets. This enables you to generate solution reports and perform additional
analyses with the SAS System. Although PROC LP reports solutions, this feature is particularly
useful for reporting solutions in formats tailored to your specific needs. Saving computational results
in a data set also enables you to continue executing a problem not solved because of insufficient time
or other computational problems.
The LP procedure uses the Output Delivery System (ODS), a SAS subsystem that provides capabili-
ties for displaying and controlling the output from SAS procedures. ODS enables you to modify the
headers, column names, data formats, and layouts of the output tables in PROC LP.
There are no restrictions on the problem size in the LP procedure. The number of constraints and
variables in a problem that PROC LP can solve depends on the host platform, the available memory,
and the available disk space for utility data sets.
You can also solve LP problems by using the OPTLP procedure. The OPTLP procedure requires a
linear program to be specied using a SAS data set that adheres to the MPS format, a widely accepted
format in the optimization community. You can use the MPSOUT= option in the LP procedure to
convert typical PROC LP format data sets into MPS-format SAS data sets.
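For example, if a problem is stored in the dense format in a data set named dense (a placeholder name), the following sketch writes an MPS-format SAS data set named mpsdata that the OPTLP procedure can read:

proc lp data=dense mpsout=mpsdata;
run;
quit;

Example 5.15 shows this conversion in more detail.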
Getting Started: LP Procedure
PROC LP expects the definition of one or more linear, integer, or mixed-integer programs in an input
data set. There are two formats, a dense format and a sparse format, for this data set.
In the dense format, a model is expressed in a way similar to how it is formulated. Each SAS variable
corresponds to a model's column, and each SAS observation corresponds to a model's row. A SAS
variable in the input data set is one of the following:
• a type variable
• an id variable
• a structural variable
• a right-hand-side variable
• a right-hand-side sensitivity analysis variable or
• a range variable
The type variable tells PROC LP how to interpret the observation as a part of the mathematical
programming problem. It identifies and classifies objectives, constraints, and the rows that contain
information about variables, such as types, bounds, and so on. PROC LP recognizes the following keywords
as values for the type variable: MIN, MAX, EQ, LE, GE, SOSEQ, SOSLE, UNRSTRT, LOWERBD,
UPPERBD, FIXED, INTEGER, BINARY, BASIC, PRICESEN, and FREE. The values of the id
variable are the names of the rows in the model. The other variables identify and classify the columns
with numerical values.
The sparse format to PROC LP is designed to enable you to specify only the nonzero coefficients in
the description of linear programs, integer programs, and mixed-integer programs. The SAS data set
that describes the sparse model must contain at least four SAS variables:
• a type variable
• a column variable
• a row variable and
• a coefficient variable
Each observation in the data set associates a type with a row or a column, or defines a coefficient
or a numerical value in the model, or both. In addition to the keywords in the dense format, PROC
LP also recognizes the keywords RHS, RHSSEN, and RANGE as values of the type variable. The
values of the row and column variables are the names of the rows and columns in the model. The
values of the coefficient variables give the coefficients or other numerical data. The SAS data set
can contain multiple pairs of row and coefficient variables. In this way, more information about the
model can be specified in each observation in the data set. See the section "Sparse Data Input Format"
on page 208 for further discussion.
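To illustrate, a single observation can carry two coefficients at once by using two row-coefficient pairs; the variable names _ROW1_, _COEF1_, _ROW2_, and _COEF2_ below are hypothetical but follow the sparse format's naming convention:

data twopair;
   format _type_ $8. _col_ $8. _row1_ $16. _row2_ $16.;
   input _type_ $ _col_ $ _row1_ $ _coef1_ _row2_ $ _coef2_;
   datalines;
.  a_light  profit  -175  available  110
;

Here one observation defines both the objective coefficient and the available-row entry for the column a_light.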
With both the dense and sparse formats for model specication, the observation order is not important.
This feature is particularly useful when using the sparse model input.
An Introductory Example
A simple blending problem illustrates the dense and sparse input formats and the use of PROC LP. A
step in refining crude oil into finished oil products involves a distillation process that splits crude into
various streams. Suppose there are three types of crude available: Arabian light, Arabian heavy, and
Brega. These types of crude are distilled into light naphtha, intermediate naphtha, and heating oil.
These in turn are blended into jet fuel using one of two recipes. What amounts of the three crudes
maximize the profit from producing jet fuel? A formulation to answer this question is as follows:
maximize   −175 a_light − 165 a_heavy − 205 brega + 300 jet_1 + 300 jet_2
subject to  .035 a_light + .03 a_heavy + .045 brega = naphthal
            .1 a_light + .075 a_heavy + .135 brega = naphthai
            .39 a_light + .3 a_heavy + .43 brega = heatingo
            .3 naphthai + .7 heatingo = jet_1
            .2 naphthal + .8 heatingo = jet_2
            a_light ≤ 110
            a_heavy ≤ 165
            brega ≤ 80
            a_light, a_heavy, brega, naphthai,
            naphthal, heatingo, jet_1, jet_2 ≥ 0
The following data set gives the representation of this formulation. Notice that the variable names
are the structural variables, the rows are the constraints, and the coefcients are given as the values
for the structural variables.
data;
input _id_ $17.
a_light a_heavy brega naphthal naphthai
heatingo jet_1 jet_2
_type_ $ _rhs_;
datalines;
profit -175 -165 -205 0 0 0 300 300 max .
naphtha_l_conv .035 .030 .045 -1 0 0 0 0 eq 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0 eq 0
heating_o_conv .390 .300 .430 0 0 -1 0 0 eq 0
recipe_1 0 0 0 0 .3 .7 -1 0 eq 0
recipe_2 0 0 0 .2 0 .8 0 -1 eq 0
available 110 165 80 . . . . . upperbd .
;
The same model can be specied in the sparse format, as follows. This format enables you to omit
the zero coefcients.
data;
format _type_ $8. _col_ $8. _row_ $16. ;
input _type_ $ _col_ $ _row_ $ _coef_ ;
datalines;
max . profit .
eq . napha_l_conv .
eq . napha_i_conv .
eq . heating_oil_conv .
eq . recipe_1 .
eq . recipe_2 .
upperbd . available .
. a_light profit -175
. a_light napha_l_conv .035
. a_light napha_i_conv .100
. a_light heating_oil_conv .390
. a_light available 110
. a_heavy profit -165
. a_heavy napha_l_conv .030
. a_heavy napha_i_conv .075
. a_heavy heating_oil_conv .300
. a_heavy available 165
. brega profit -205
. brega napha_l_conv .045
. brega napha_i_conv .135
. brega heating_oil_conv .430
. brega available 80
. naphthal napha_l_conv -1
. naphthal recipe_2 .2
. naphthai napha_i_conv -1
. naphthai recipe_1 .3
. heatingo heating_oil_conv -1
. heatingo recipe_1 .7
. heatingo recipe_2 .8
. jet_1 profit 300
. jet_1 recipe_1 -1
. jet_2 profit 300
. jet_2 recipe_2 -1
. _rhs_ recipe_1 0
;
Because the input order of the model into PROC LP is unimportant, this model can be specified
in sparse input in arbitrary row order. Example 5.2 in the section "Examples: LP Procedure" on
page 241 demonstrates this.
The dense and sparse forms of model input give you flexibility to generate models using the SAS
language. The dense form of the model is solved with the statements
proc lp;
run;
The sparse form is solved with the statements
proc lp sparsedata;
run;
Example 5.1 and Example 5.2 in the section "Examples: LP Procedure" on page 241 continue with
this problem.
Problem Input
By default, PROC LP uses the most recently created SAS data set as the problem input data set.
However, if you want to input the problem from a specic SAS data set, use the DATA= option. For
example, if the previous dense form data set has the name DENSE, the PROC LP statements can be
written as
proc lp data=dense;
run;
Problem Definition Statements
In the previous dense form data set, the _ID_, _TYPE_, and _RHS_ variables are special variables
in PROC LP. They stand for the id variable, the type variable, and the right-hand-side variable. If you
replace those variable names with, for example, ROWNAME, TYPE, and RHS, you need the problem
definition statements (ID, TYPE, and RHS) in PROC LP:
proc lp;
id rowname;
type type;
rhs rhs;
run;
Other special variables for the dense format are _RHSSEN_ and _RANGE_, which identify the vectors
for the right-hand-side sensitivity and range analyses. The corresponding statements are the RHSSEN
and RANGE statements. (Notice that a variable name can be identical to a statement name.)
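For instance, if the dense-format data set carried those vectors in variables named CHANGE and BNDS (hypothetical names) instead of _RHSSEN_ and _RANGE_, the statements would be

proc lp;
   rhssen change;
   range bnds;
run;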
In the same way, if you replace the variables _COL_, _ROW_, _TYPE_, and _COEF_ in the previous
sparse form data set by COLUMN, ROW, TYPE, and COEF, you need the problem definition statements
(COL, ROW, TYPE, and COEF) in PROC LP:
proc lp sparsedata;
col column;
row row;
type type;
coef coef;
run;
In the sparse form data set, the value _RHS_ under the variable _COL_ is a special column name,
which represents the model's right-hand-side column. If you replace it by a value R, the PROC LP
statements would be
proc lp sparsedata;
rhs r;
run;
Other special column names for the sparse format are _RHSSEN_ and _RANGE_. The corre-
sponding statements are the RHSSEN and RANGE statements.
PROC LP is case insensitive to variable names and all character values, including the row and column
names in the sparse format. The order of the problem definition statements is not important.
For the dense format, a model's row names appear as character values in a SAS data set. For the
sparse format, both the row and the column names of the model appear as character values in the
data set. Thus, you can put spaces or other special characters in the names. When referring to these
names in the problem definition statement or other LP statements, you must use single or double
quotes around them. For example, if you replace _RHS_ by "r h s" in the previous sparse form
data set, the PROC LP statements would become
proc lp sparsedata;
rhs "r h s";
run;
LP Options
The specifications SPARSEDATA and DATA= in the previous examples are PROC LP options.
PROC LP options include
• data set options
• display control options
• interactive control options
• preprocessing control options
• branch-and-bound control options
• sensitivity/parametric/ranging control options
• simplex algorithm control options
Interactive Processing
Interactive control options include READPAUSE, ENDPAUSE, and so forth. You can run PROC LP
interactively using those options. For example, for the blending problem example in the dense form,
you can first pause the procedure before iterations start with the READPAUSE option. The PROC
LP statements are
proc lp readpause;
run;
When the procedure pauses, you run the PRINT statement to display the initial technological matrix
and see if the input is correct. Then you run the PIVOT statement to do one simplex pivot and pause.
After that you use the SHOW statement to check the current solution status. Then you apply the
RESET statement to tell the procedure to stop as soon as it finds a solution. Now you use the RUN
statement to continue the execution. When the procedure stops, you run the PRINT statement again
to do a price range analysis and QUIT the procedure. Use a SAS %PUT statement to display the
contents of PROC LP's macro variable, _ORLP_, which contains iteration and solution information.
What follows are the complete statements in batch mode:
proc lp readpause;
run;
print matrix(,);   /* display all rows and columns. */
pivot;
show status;
reset endpause;
run;
print rangeprice;
quit;
%put &_orlp_;
NOTE: You can force PROC LP to pause during iterations by using the CTRL-BREAK key.
An Integer Programming Example
The following is a simple mixed-integer programming problem. Details can be found in Example 5.8
in the section "Examples: LP Procedure" on page 241.
data;
format _row_ $10.;
input _row_ $ choco gumdr ichoco igumdr _type_ $ _rhs_;
datalines;
object .25 .75 -100 -75 max .
cooking 15 40 0 0 le 27000
color 0 56.25 0 0 le 27000
package 18.75 0 0 0 le 27000
condiments 12 50 0 0 le 27000
chocolate 1 0 -10000 0 le 0
gum 0 1 0 -10000 le 0
only_one 0 0 1 1 eq 1
binary . . 1 2 binary .
;
The row with binary type indicates that this problem is a mixed-integer program and all the integer
variables are binary. The integer values of the row set an ordering for PROC LP to pick the branching
variable when VARSELECT=PRIOR is chosen. Smaller values will have higher priorities. The
_ROW_ variable here is an alias of the _ID_ variable.
This problem can be solved with the following statements:
proc lp canselect=lifo backtrack=obj varselect=far endpause;
run;
quit;
%put &_orlp_;
The options CANSELECT=, BACKTRACK=, and VARSELECT= specify the rules for picking the
next active problem and the rule to choose the branching variable. In this example, the values LIFO,
OBJ, and FAR serve as the default values, so the three options can be omitted from the PROC LP
statement. The following is the output from the %PUT statement:
STATUS=SUCCESSFUL PHASE=3 OBJECTIVE=285 P_FEAS=YES D_FEAS=YES INT_ITER=3
INT_FEAS=2 ACTIVE=0 INT_BEST=285 PHASE1_ITER=0 PHASE2_ITER=5
PHASE3_ITER=5
Preprocessing
Using the PREPROCESS= option, you can apply the preprocessing techniques to pre-solve and then
solve the preceding mixed-integer program:
proc lp preprocess=1 endpause;
run;
quit;
%put &_orlp_;
The preprocessing statistics are written to the SAS log file as follows:
NOTE: Preprocessing 1 ...
NOTE: 2 upper bounds decreased.
NOTE: 2 coefficients reduced.
NOTE: Preprocessing 2 ...
NOTE: 2 constraints eliminated.
NOTE: Preprocessing done.
The new output _ORLP_ is as follows:
STATUS=SUCCESSFUL PHASE=3 OBJECTIVE=285 P_FEAS=YES D_FEAS=YES INT_ITER=0
INT_FEAS=1 ACTIVE=0 INT_BEST=285 PHASE1_ITER=0 PHASE2_ITER=5
PHASE3_ITER=0
In this example, the number of integer iterations (INT_ITER=) is zero, which means that the
preprocessing has reduced the gap between the relaxed linear problem and the mixed-integer program
to zero.
An MPS Format to Sparse Format Conversion Example
If your model input is in MPS input format, you can convert it to the sparse input format of PROC
LP by using the SAS macro function SASMPSXS. For example, if you have an MPS file called
MODEL.MPS that is stored in the directory C:\OR on a PC, the following program can help you to
convert the file and solve the problem.
%sasmpsxs(mpsfile="c:\or\model.mps",lpdata=lp);
data;
set lp;
retain i 1;
if _type_="FREE" and i=1 then
do;
_type_="MIN";
i=0;
end;
run;
proc lp sparsedata;
run;
In the MPS input format, all objective functions, price change rows, and free rows have the type N.
The SASMPSXS macro marks them as FREE rows. After the conversion, you must run a DATA
step to identify the objective rows and price change rows. In this example, assume that the problem
is one of minimization and the first FREE row is an objective row.
Syntax: LP Procedure
Below are statements used in PROC LP, listed in alphabetical order as they appear in the text that
follows.
PROC LP options ;
COEF variables ;
COL variable ;
ID variable(s) ;
IPIVOT ;
PIVOT ;
PRINT options ;
QUIT options ;
RANGE variable ;
RESET options ;
RHS variables ;
RHSSEN variables ;
ROW variable(s) ;
RUN ;
SHOW options ;
TYPE variable ;
VAR variables ;
The TYPE, ID (or ROW), VAR, RHS, RHSSEN, and RANGE statements are used for identifying
variables in the problem data set when the model is in the dense input format. In the dense input
format, a model's variables appear as variables in the problem data set. The TYPE, ID (or ROW),
and RHS statements can be omitted if the input data set contains variables _TYPE_, _ID_ (or _ROW_),
and _RHS_; otherwise, they must be used. The VAR statement is optional. When it is omitted,
PROC LP treats all numeric variables that are not explicitly or implicitly included in RHS, RHSSEN,
and RANGE statements as structural variables. The RHSSEN and RANGE statements are optional
statements for sensitivity and range analyses. They can be omitted if the input data set contains the
_RHSSEN_ and _RANGE_ variables.
The TYPE, COL, ROW (or ID), COEF, RHS, RHSSEN, and RANGE statements are used for
identifying variables in the problem data set when the model is in the sparse input format. In the
sparse input format, a model's rows and columns appear as observations in the problem data set. The
TYPE, COL, ROW (or ID), and COEF statements can be omitted if the input data set contains the
_TYPE_ and _COL_ variables, as well as variables beginning with the prefixes _ROW (or _ID) and
_COEF. Otherwise, they must be used. The RHS, RHSSEN, and RANGE statements identify the
corresponding columns in the model. These statements can be omitted if there are observations that
contain the RHS, RHSSEN, and RANGE types or the _RHS_, _RHSSEN_, and _RANGE_ column
values.
The SHOW, RESET, PRINT, QUIT, PIVOT, IPIVOT, and RUN statements are especially useful
when executing PROC LP interactively. However, they can also be used in batch mode.
Functional Summary
The statements and options available with PROC LP are summarized by purpose in the following
table.
Table 5.1 Functional Summary
Description Statement Option
Interactive Statements:
Perform one integer pivot and pause IPIVOT
Perform one simplex pivot and pause PIVOT
Display information at current iteration PRINT
Terminate processing immediately QUIT
Reset options specied RESET
Start or resume optimization RUN
Show settings of options SHOW
Variable Lists:
Variables that contain coefficients (sparse) COEF
Variable that contains column names (sparse) COL
Alias for the ROW statement ID
Variable (column) that contains the range constant
for the dense (sparse) format
RANGE
Variables (columns) that contain RHS constants
for the dense (sparse) format
RHS
Variables (columns) that define RHS change vectors
for the dense (sparse) format
RHSSEN
Variable that contains names of constraints and objective functions (names of rows) for the dense (sparse) format ROW
Variable that contains the type of each observation TYPE
Structural variables (dense) VAR
Data Set Options:
Active nodes input data set PROC LP ACTIVEIN=
Active nodes output data set PROC LP ACTIVEOUT=
Input data set PROC LP DATA=
Dual output data set PROC LP DUALOUT=
Primal input data set PROC LP PRIMALIN=
Primal output data set PROC LP PRIMALOUT=
Sparse format data input flag PROC LP SPARSEDATA
Tableau output data set PROC LP TABLEAUOUT=
Convert sparse or dense format input data set into MPS-format output data set PROC LP MPSOUT=
Display Control Options:
Display iteration log PROC LP FLOW
Nonzero tolerance displaying PROC LP FUZZ=
Inverse of FLOW option PROC LP NOFLOW
Inverse of PARAPRINT option PROC LP NOPARAPRINT
Omit some displaying PROC LP NOPRINT
Inverse of TABLEAUPRINT PROC LP NOTABLEAUPRINT
Parametric programming displaying PROC LP PARAPRINT
Inverse of NOPRINT PROC LP PRINT
Iteration frequency of display PROC LP PRINTFREQ=
Level of display desired PROC LP PRINTLEVEL=
Display the final tableau PROC LP TABLEAUPRINT
Interactive Control Options:
Pause before displaying the solution PROC LP ENDPAUSE
Pause after first feasible solution PROC LP FEASIBLEPAUSE
Pause frequency of integer solutions PROC LP IFEASIBLEPAUSE=
Pause frequency of integer iterations PROC LP IPAUSE=
Inverse of ENDPAUSE PROC LP NOENDPAUSE
Inverse of FEASIBLEPAUSE PROC LP NOFEASIBLEPAUSE
Pause frequency of iterations PROC LP PAUSE=
Pause if within specified proximity PROC LP PROXIMITYPAUSE=
Pause after data is read PROC LP READPAUSE
Preprocessing Control Options:
Do not perform preprocessing PROC LP NOPREPROCESS
Preprocessing error tolerance PROC LP PEPSILON=
Limit preprocessing iterations PROC LP PMAXIT=
Perform preprocessing techniques PROC LP PREPROCESS
Branch-and-Bound (BB) Control Options:
Perform automatic node selection technique PROC LP AUTO
Backtrack strategy to be used PROC LP BACKTRACK=
Branch on binary variables first PROC LP BINFST
Active node selection strategy PROC LP CANSELECT=
Comprehensive node selection control parameter PROC LP CONTROL=
Backtrack related technique PROC LP DELTAIT=
Measure for pruning BB tree PROC LP DOBJECTIVE=
Integer tolerance PROC LP IEPSILON=
Limit integer iterations PROC LP IMAXIT=
Measure for pruning BB tree PROC LP IOBJECTIVE=
Order of two branched nodes in adding to BB tree PROC LP LIFOTYPE=
Inverse of AUTO PROC LP NOAUTO
Inverse of BINFST PROC LP NOBINFST
Inverse of POSTPROCESS PROC LP NOPOSTPROCESS
Limit number of branching variables PROC LP PENALTYDEPTH=
Measure for pruning BB tree PROC LP POBJECTIVE=
Perform variable fixing technique PROC LP POSTPROCESS
Percentage used in updating WOBJECTIVE PROC LP PWOBJECTIVE=
Compression algorithm for storing active nodes PROC LP TREETYPE=
Branching variable selection strategy PROC LP VARSELECT=
Delay examination of some active nodes PROC LP WOBJECTIVE=
Sensitivity/Parametric/Ranging Control Options:
Inverse of RANGEPRICE PROC LP NORANGEPRICE
Inverse of RANGERHS PROC LP NORANGERHS
Limit perturbation of the price vector PROC LP PRICEPHI=
Range analysis on the price coefficients PROC LP RANGEPRICE
Range analysis on the RHS vector PROC LP RANGERHS
Limit perturbation of the RHS vector PROC LP RHSPHI=
Simplex Algorithm Control Options:
Use devex method PROC LP DEVEX
General error tolerance PROC LP EPSILON=
Perform goal programming PROC LP GOALPROGRAM
Largest number used in computation PROC LP INFINITY=
Reinversion frequency PROC LP INVFREQ=
Reinversion tolerance PROC LP INVTOL=
Simultaneously set MAXIT1, MAXIT2, MAXIT3, and IMAXIT values PROC LP MAXIT=
Limit phase 1 iterations PROC LP MAXIT1=
Limit phase 2 iterations PROC LP MAXIT2=
Limit phase 3 iterations PROC LP MAXIT3=
Inverse of devex PROC LP NODEVEX
Restore basis after parametric programming PROC LP PARARESTORE
Weight of the phase 2 objective function in phase 1 PROC LP PHASEMIX=
Multiple pricing strategy PROC LP PRICETYPE=
Number of columns to subset in multiple pricing PROC LP PRICE=
Limit the number of iterations randomly selecting each entering variable during phase 1 PROC LP RANDOMPRICEMULT=
Zero tolerance in ratio test PROC LP REPSILON=
Scaling type to be performed PROC LP SCALE=
Zero tolerance in LU decomposition PROC LP SMALL=
Time pause limit PROC LP TIME=
Control pivoting during LU decomposition PROC LP U=
RESET Statement Options:
The RESET statement supports the same options as the PROC LP statement except for
the DATA=, PRIMALIN=, and ACTIVEIN= options, and supports the following additional
options:
New variable lower bound during phase 3 RESET LOWER=
New variable upper bound during phase 3 RESET UPPER=
PRINT Statement Options:
Display the best integer solution PRINT BEST
Display variable summary for specified columns PRINT COLUMN
Display variable summary and price sensitivity analysis for specified columns PRINT COLUMN / SENSITIVITY
Display variable summary for integer variables PRINT INTEGER
Display variable summary for nonzero integer variables PRINT INTEGER_NONZEROS
Display variable summary for integer variables with zero activity PRINT INTEGER_ZEROS
Display submatrix for specified rows and columns PRINT MATRIX
Display formatted submatrix for specified rows and columns PRINT MATRIX / PICTURE
Display variable summary for continuous variables PRINT NONINTEGER
Display variable summary for nonzero continuous variables PRINT NONINTEGER_NONZEROS
Display variable summary for variables with nonzero activity PRINT NONZEROS
Display price sensitivity analysis or price parametric programming PRINT PRICESEN
Display price range analysis PRINT RANGEPRICE
Display RHS range analysis PRINT RANGERHS
Display RHS sensitivity analysis or RHS parametric programming PRINT RHSSEN
Display constraint summary for specified rows PRINT ROW
Display constraint summary and RHS sensitivity analysis for specified rows PRINT ROW / SENSITIVITY
Display solution, variable, and constraint summaries PRINT SOLUTION
Display current tableau PRINT TABLEAU
Display variables with zero activity PRINT ZEROS
SHOW Statement Options:
Display options applied SHOW OPTIONS
Display status of the current solution SHOW STATUS
QUIT Statement Option:
Save the defined output data sets and then terminate PROC LP QUIT / SAVE
PROC LP Statement
PROC LP options ;
This statement invokes the procedure. The following options can appear in the PROC LP statement.
Data Set Options
ACTIVEIN=SAS-data-set
names the SAS data set containing the active nodes in a branch-and-bound tree that is to be
used to restart an integer program.
ACTIVEOUT=SAS-data-set
names the SAS data set in which to save the current branch-and-bound tree of active nodes.
DATA=SAS-data-set
names the SAS data set containing the problem data. If the DATA= option is not specified,
PROC LP uses the most recently created SAS data set.
DUALOUT=SAS-data-set
names the SAS data set that contains the current dual solution (shadow prices) on termination
of PROC LP. This data set contains the current dual solution only if PROC LP terminates
successfully.
MPSOUT=SAS-data-set
names the SAS data set that contains converted sparse or dense format input data in MPS
format. Invoking this option directs the LP procedure to halt before attempting optimization.
For more information about the MPS-format SAS data set, see Chapter 16, The MPS-Format
SAS Data Set.
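A minimal conversion run, using hypothetical data set names, might look like the following; PROC LP halts before optimizing:

```sas
/* Convert a dense- or sparse-format problem to an MPS-format data set */
proc lp data=prob mpsout=mpsdata;
run;
```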
PRIMALIN=SAS-data-set
names the SAS data set that contains a feasible solution to the problem defined by the DATA=
data set. The data set specied in the PRIMALIN= option should have the same format as
a data set saved using the PRIMALOUT= option. Specifying the PRIMALIN= option is
particularly useful for continuing iteration on a problem previously attempted. It is also useful
for performing sensitivity analysis on a previously solved problem.
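The save-and-restart pattern described here can be sketched as follows (data set names are hypothetical):

```sas
/* First run: save the primal solution on termination */
proc lp data=prob primalout=psol;
run;

/* Later run: continue iterating from the saved solution */
proc lp data=prob primalin=psol;
run;
```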
PRIMALOUT=SAS-data-set
names the SAS data set that contains the current primal solution when PROC LP terminates.
SPARSEDATA
tells PROC LP that the data are in the sparse input format. If this option is not specified, PROC
LP assumes that the data are in the dense input format. See the section "Sparse Data Input
Format" on page 208 for information about the sparse input format.
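As a sketch of the sparse input format using the default variable names, a small hypothetical problem (maximize 2x + 3y subject to x + y ≤ 10) might be entered like this:

```sas
data sparseprob;
   input _type_ $ _col_ $ _row_ $ _coef_;
   datalines;
max  .     profit  .
le   .     limit   .
.    x     profit  2
.    y     profit  3
.    x     limit   1
.    y     limit   1
.    _RHS_ limit   10
;

proc lp data=sparseprob sparsedata;
run;
```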
TABLEAUOUT=SAS-data-set
names the SAS data set in which to save the final tableau.
Display Control Options
FLOW
requests that a journal of pivot information (the Iteration Log) be displayed after every
PRINTFREQ= iterations. This includes the names of the variables entering and leaving the
basis, the reduced cost of the entering variable, and the current objective value.
FUZZ=e
displays all numbers within e of zero as zeros. The default value is 1.0E−10.
NOFLOW
is the inverse of the FLOW option.
NOPARAPRINT
is the inverse of the PARAPRINT option.
NOPRINT
suppresses the display of the Variable, Constraint, and Sensitivity Analysis summaries. This
option is equivalent to the PRINTLEVEL=0 option.
NOTABLEAUPRINT
is the inverse of the TABLEAUPRINT option.
PARAPRINT
indicates that the solution be displayed at each pivot when performing parametric programming.
PRINT
is the inverse of the NOPRINT option.
PRINTFREQ=m
indicates that after every mth iteration, a line in the (Integer) Iteration Log be displayed. The
default value is 1.
PRINTLEVEL=i
indicates the amount of displaying that the procedure should perform.
PRINTLEVEL=-2 only messages to the SAS log are displayed
PRINTLEVEL=-1 is equivalent to NOPRINT unless the problem is infeasible. If it
is infeasible, the infeasible rows are displayed in the Constraint
Summary along with the Infeasible Information Summary.
PRINTLEVEL=0 is identical to NOPRINT
PRINTLEVEL=1 all output is displayed
The default value is 1.
TABLEAUPRINT
indicates that the final tableau be displayed.
Interactive Control Options
ENDPAUSE
requests that PROC LP pause before displaying the solution. When this pause occurs, you can
enter the RESET, SHOW, PRINT, RUN, and QUIT statements.
FEASIBLEPAUSE
requests that PROC LP pause after a feasible (not necessarily integer feasible) solution has
been found. At a pause, you can enter the RESET, SHOW, PRINT, PIVOT, RUN, and QUIT
statements.
IFEASIBLEPAUSE=n
requests that PROC LP pause after every n integer feasible solutions. At a pause, you can enter
the RESET, SHOW, PRINT, IPIVOT, PIVOT, RUN, and QUIT statements. The default value
is 99999999.
IPAUSE=n
requests that PROC LP pause after every n integer iterations. At a pause, you can enter RESET,
SHOW, PRINT, IPIVOT, PIVOT, RUN, and QUIT statements. The default value is 99999999.
NOENDPAUSE
is the inverse of the ENDPAUSE option.
NOFEASIBLEPAUSE
is the inverse of the FEASIBLEPAUSE option.
PAUSE=n
requests that PROC LP pause after every n iterations. At a pause, you can enter the RESET,
SHOW, PRINT, IPIVOT, PIVOT, RUN, and QUIT statements. The default value is 99999999.
PROXIMITYPAUSE=r
causes the procedure to pause if at least one integer feasible solution has been found and the
objective value of the current best integer solution can be determined to be within r units of
the optimal integer solution. This distance, called proximity, is also displayed on the Integer
Iteration Log. Note that the proximity is calculated using the minimum (maximum if the
problem is maximization) objective value among the nodes that remain to be explored in the
branch-and-bound tree as a bound on the value of the optimal integer solution. Following
the first PROXIMITYPAUSE= pause, in order to avoid a pause at every iteration thereafter,
it is recommended that you reduce this measure through the use of a RESET statement.
Otherwise, if any other option or statement that causes the procedure to pause is used while
the PROXIMITYPAUSE= option is in effect, pause interferences may occur. When this
pause occurs, you can enter the RESET, SHOW, PRINT, IPIVOT, PIVOT, RUN, and QUIT
statements. The default value is 0.
READPAUSE
requests that PROC LP pause after the data have been read and the initial basis inverted. When
this pause occurs, you can enter the RESET, SHOW, PRINT, IPIVOT, PIVOT, RUN, and
QUIT statements.
Preprocessing Control Options
NOPREPROCESS
is the inverse of the PREPROCESS option.
PEPSILON=e
specifies a positive number close to zero. This value is an error tolerance in the preprocessing.
If the value is too small, any marginal changes may cause the preprocessing to repeat itself.
However, if the value is too large, it may alter the optimal solution or falsely claim that the
problem is infeasible. The default value is 1.0E−8.
PMAXIT=n
performs at most n preprocessings. Preprocessing repeats itself if it improves some bounds or
fixes some variables. However, when a problem is large and dense, each preprocessing may
take a significant amount of CPU time. This option limits the number of preprocessings PROC
LP performs. It can also reduce the build-up of round-off errors. The default value is 100.
PREPROCESS
performs preprocessing techniques. See the section Preprocessing on page 216 for further
discussion.
Branch-and-Bound Algorithm Control Options
AUTO, AUTO(m,n)
automatically sets and adjusts the value of the CONTROL= option. Initially, it sets
CONTROL=0.70, concentrating on finding an integer feasible solution or an upper bound. When an
upper bound is found, it sets CONTROL=0.5, concentrating on efficiency and lower bound
improvement. When the number of active problems exceeds m, it starts to gradually increase
the value of the CONTROL= option to keep the size of active problems under control. When
total active problems exceed n, CONTROL=1 will keep the active problems from growing
further. You can alter the automatic process by resetting the value of the CONTROL= option
interactively.
The default values of m and n are 20000 and 250000, respectively. You can change the two
values according to your computer's space and memory capacities.
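For example, on a machine with limited memory you might lower both thresholds (the data set name and the values shown are arbitrary):

```sas
proc lp data=prob auto(10000,100000);
run;
```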
BACKTRACK=rule
specifies the rule used to choose the next active problem when backtracking is required. One
of the following can be specied:
• BACKTRACK=LIFO
• BACKTRACK=FIFO
• BACKTRACK=OBJ
• BACKTRACK=PROJECT
• BACKTRACK=PSEUDOC
• BACKTRACK=ERROR
The default value is OBJ. See the section Integer Programming on page 217 for further
discussion.
BINFST
requests that PROC LP branch on binary variables first when integer and binary variables
are present. The reasoning behind this is that a subproblem will usually be fathomed or
found integer feasible after less than 20% of its variables have been fixed. Considering binary
variables first attempts to reduce the size of the branch-and-bound tree. It is a heuristic
technique.
CANSELECT=rule
specifies the rule used to choose the next active problem when backtracking is not required or
used. One of the following can be specied:
• CANSELECT=LIFO
• CANSELECT=FIFO
• CANSELECT=OBJ
• CANSELECT=PROJECT
• CANSELECT=PSEUDOC
• CANSELECT=ERROR
The default value is LIFO. See the section Integer Programming on page 217 for further
discussion.
CONTROL=r
specifies a number between 0 and 1. This option combines CANSELECT= and other rules to
choose the next active problem. It takes into consideration three factors: efficiency, improving
lower bounds, and improving upper bounds. When r is close to 0, PROC LP concentrates on
improving lower bounds (upper bounds for maximization). However, the efficiency per integer
iteration is usually the worst. When r is close to 1, PROC LP concentrates on improving upper
bounds (lower bounds for maximization). In addition, the growth of active problems will be
controlled and stopped at r = 1. When its value is around 0.5, PROC LP will be in the most
efficient state in terms of CPU time and the number of integer iterations. The CONTROL= option
will be automatically adjusted when the AUTO option is applied.
DELTAIT=r
is used to modify the exploration of the branch-and-bound tree. If more than r integer iterations
have occurred since the last integer solution was found, then the procedure uses the backtrack
strategy in choosing the next node to be explored. The default value is 3 times the number of
integer variables.
DOBJECTIVE=r
specifies that PROC LP should discard active nodes that cannot lead to an integer solution with
the objective at least as small (or as large for maximizations) as the objective of the relaxed
problem plus (minus) r. The default value is +∞.
IEPSILON=e
requests that PROC LP consider an integer variable as having an integer value if its value is
within e units of an integer. The default value is 1.0E−7.
IMAXIT=n
performs at most n integer iterations. The default value is 100.
IOBJECTIVE=r
specifies that PROC LP should discard active nodes unless the node could lead to an integer
solution with the objective smaller (or larger for maximizations) than r. The default value is
+∞ for minimization (−∞ for maximization).
LIFOTYPE=c
specifies the order in which to add the two newly branched active nodes to the LIFO list.
LIFOTYPE=0 add the node with minimum penalty rst
LIFOTYPE=1 add the node with maximum penalty rst
LIFOTYPE=2 add the node resulting from adding x_i ≤ ⌊x_i^opt(k)⌋ first
LIFOTYPE=3 add the node resulting from adding x_i ≥ ⌈x_i^opt(k)⌉ first
The default value is 0.
NOAUTO
is the inverse of the AUTO option.
NOBINFST
is the inverse of the BINFST option.
NOPOSTPROCESS
is the inverse of the POSTPROCESS option.
PENALTYDEPTH=m
requests that PROC LP examine m variables as branching candidates when
VARSELECT=PENALTY. If the PENALTYDEPTH= option is not specified when
VARSELECT=PENALTY, then all of the variables are considered branching candidates. The default
value is the number of integer variables. See the section Integer Programming on page 217
for further discussion.
POBJECTIVE=r
specifies that PROC LP should discard active nodes that cannot lead to an integer solution with
objective at least as small as o − |o|·r (at least as large as o + |o|·r for maximizations),
where o is the objective of the relaxed noninteger constrained problem. The default value is
+∞.
POSTPROCESS
attempts to fix binary variables globally based on the relationships among the reduced cost
and objective value of the relaxed problem and the objective value of the current best integer
feasible solution.
PWOBJECTIVE=r
specifies a percentage for use in the automatic update of the WOBJECTIVE= option. If the
WOBJECTIVE= option is not specified in PROC LP, then when an integer feasible solution is
found, the value of the option is updated to be b + q·r where b is the best bound on the value
of the optimal integer solution and q is the current proximity. Note that for maximizations,
b − q·r is used. The default value is 0.95.
TREETYPE=i
specifies a data compression algorithm.
TREETYPE=0 no data compression
TREETYPE=1 Huffman coding compression routines
TREETYPE=2 adaptive Huffman coding compression routines
TREETYPE=3 adaptive arithmetic coding compression routines
For IP or MIP problems, the basis and bounds information of each active node is saved to a
utility file. When the number of active nodes increases, the size of the utility file becomes
larger and larger. If PROC LP runs into a disk problem, like "disk full . . ." or "writing failure
. . .", you can use this option to compress the utility file. For more information on the data
compression routines, refer to Nelson (1992). The default value is 0.
VARSELECT=rule
specifies the rule used to choose the branching variable on an integer iteration.
• VARSELECT=CLOSE
• VARSELECT=PRIOR
• VARSELECT=PSEUDOC
• VARSELECT=FAR
• VARSELECT=PRICE
• VARSELECT=PENALTY
The default value is FAR. See the section Integer Programming on page 217 for further
discussion.
WOBJECTIVE=r
specifies that PROC LP should delay examination of active nodes that cannot lead to an integer
solution with objective at least as small (as large for maximizations) as r, until all other active
nodes have been explored. The default value is +∞ for minimization (−∞ for maximization).
Sensitivity/Parametric/Ranging Control Options
NORANGEPRICE
is the inverse of the RANGEPRICE option.
NORANGERHS
is the inverse of the RANGERHS option.
PRICEPHI=Φ
specifies the limit Φ for parametric programming when perturbing the price vector. See the
section Parametric Programming on page 228 for further discussion. See Example 5.5 for
an illustration of this option.
RANGEPRICE
indicates that range analysis is to be performed on the price coefficients. See the section
Range Analysis on page 228 for further discussion.
RANGERHS
indicates that range analysis is to be performed on the right-hand-side vector. See the section
Range Analysis on page 228 for further discussion.
RHSPHI=Φ
specifies the limit Φ for parametric programming when perturbing the right-hand-side vector.
See the section Parametric Programming on page 228 for further discussion.
Simplex Algorithm Control Options
DEVEX
indicates that the devex method of weighting the reduced costs be used in pricing (Harris
1975).
EPSILON=e
specifies a positive number close to zero. It is used in the following instances:
During phase 1, if the sum of the basic artificial variables is within e of zero, the current
solution is considered feasible. If this sum is not exactly zero, then there are artificial variables
within e of zero in the current solution. In this case, a note is displayed on the SAS log.
During phase 1, if all reduced costs are ≥ −e for nonbasic variables at their lower bounds and
≤ e for nonbasic variables at their upper bounds and the sum of infeasibilities is greater than e,
then the problem is considered infeasible. If the maximum reduced cost is within e of zero, a
note is displayed on the SAS log.
During phase 2, if all reduced costs are ≥ −e for nonbasic variables at their lower bounds and
≤ e for nonbasic variables at their upper bounds, then the current solution is considered optimal.
During phases 1, 2, and 3, the EPSILON= option is also used to test if the denominator is
different from zero before performing the ratio test to determine which basic variable should
leave the basis.
The default value is 1.0E−8.
GOALPROGRAM
specifies that multiple objectives in the input data set are to be treated as sequential objectives
in a goal-programming model. The value of the right-hand-side variable in the objective row
gives the priority of the objective. Lower numbers have higher priority.
INFINITY=r
specifies the largest number PROC LP uses in computation. The INFINITY= option is used to
determine when a problem has an unbounded variable value. The default value is the largest
double precision number.
(This value is system dependent.)
INVFREQ=m
reinverts the current basis matrix after m major and minor iterations. The default value is 100.
INVTOL=r
reinverts the current basis matrix if the largest element in absolute value in the decomposed
basis matrix is greater than r. If after reinversion this condition still holds, then the value of
the INVTOL= option is increased by a factor of 10 and a note indicating this modication
is displayed on the SAS log. When r is frequently exceeded, this may be an indication of a
numerically unstable problem. The default value is 1000.
MAXIT=n
simultaneously sets the values of the MAXIT1=, MAXIT2=, MAXIT3=, and IMAXIT=
options.
MAXIT1=n
performs at most n ≥ 0 phase 1 iterations. The default value is 100.
MAXIT2=n
performs at most n ≥ 0 phase 2 iterations. If MAXIT2=0, then only phase 1 is entered so that
on successful termination PROC LP will have found a feasible, but not necessarily optimal,
solution. The default value is 100.
MAXIT3=n
performs at most n ≥ 0 phase 3 iterations. All dual pivots are counted as phase 3 pivots. The
default value is 99999999.
NODEVEX
is the inverse of the DEVEX option.
PARARESTORE
indicates that following a parametric programming analysis, PROC LP should restore the
basis.
PHASEMIX=r
specifies a number between 0 and 1. When the number is positive, PROC LP tries to improve
the objective function of phase 2 during phase 1. The PHASEMIX= option is a weight factor
of the phase 2 objective function in phase 1. The default value is 0.
PRICE=m
specifies the number of columns to subset when multiple pricing is used in selecting the
column to enter the basis (Greenberg 1978). The type of suboptimization used is determined
by the PRICETYPE= option. See the section Pricing on page 215 for a description of this
process.
PRICETYPE=pricetype
specifies the type of multiple pricing to be performed. If this option is specified and the
PRICE= option is not specied, then PRICE= is assumed to be 10. Valid values for the
PRICETYPE= option are
• PRICETYPE=COMPLETE
• PRICETYPE=DYNAMIC
• PRICETYPE=NONE
• PRICETYPE=PARTIAL
The default value is PARTIAL. See the section Pricing on page 215 for a description of this
process.
RANDOMPRICEMULT=r
specifies a number between 0 and 1. This option sets a limit, in phase 1, on the number of
iterations when PROC LP will randomly pick the entering variables. The limit equals r times
the number of nonbasic variables, or the number of basic variables, whichever is smaller. The
default value of the RANDOMPRICEMULT= option is 0.01.
REPSILON=e
specifies a positive number close to zero. The REPSILON= option is used in the ratio test to
determine which basic variable is to leave the basis. The default value is 1.0E−10.
SCALE=scale
specifies the type of scaling to be used. Valid values for the SCALE= option are
• SCALE=BOTH
• SCALE=COLUMN
• SCALE=NONE
• SCALE=ROW
The default value is BOTH. See the section Scaling on page 216 for further discussion.
SMALL=e
specifies a positive number close to zero. Any element in a matrix with a value less than e is
set to zero. The default value is machine dependent.
TIME=t
checks at each iteration to see if t seconds have elapsed since PROC LP began. If more than
t seconds have elapsed, the procedure pauses and displays the current solution. The default
value is 120 seconds.
U=r
enables PROC LP to control the choice of pivots during LU decomposition and updating the
basis matrix. The variable r should take values between EPSILON and 1.0 because small
values of r bias the algorithm toward maintaining sparsity at the expense of numerical stability
and vice versa. The more sparse the decomposed basis is, the less time each iteration takes.
The default value is 0.1.
COEF Statement
COEF variables ;
For the sparse input format, the COEF statement specifies the numeric variables in the problem
data set that contain the coefficients in the model. The value of the coefficient variable in a given
observation is the value of the coefficient in the column and row specified in the COLUMN and
ROW variables in that observation. For multiple ROW variables, the LP procedure maps the ROW
variables to the COEF variables on the basis of their order in the COEF and ROW statements. There
must be the same number of COEF variables as ROW variables. If the COEF statement is omitted,
the procedure looks for the default variable names that have the prefix _COEF.
COL Statement
COL variable ;
For the sparse input format, the COL statement specifies a character variable in the problem data
set that contains the names of the columns in the model. Columns in the model are either structural
variables, right-hand-side vectors, right-hand-side change vectors, or a range vector. The COL
variable must be a character variable. If the COL statement is omitted, the LP procedure looks for
the default variable name _COL_.
ID Statement
ID variable(s) ;
For the dense input format, the ID statement specifies a character variable in the problem data set
that contains a name for each constraint coefficients row, objective coefficients row, and variable
definition row. If the ID statement is omitted, the LP procedure looks for the default variable name,
_ID_. If this variable is not in the problem data set, the procedure assigns the default name _OBSxx_
to each row, where xx specifies the observation number in the problem data set.
For the sparse input format, the ID statement specifies the character variables in the problem data
set that contain the names of the rows in the model. Rows in the model are one of the following
types: constraints, objective functions, bounding rows, or variable describing rows. The ID variables
must be character variables. There must be the same number of ID variables as variables specified in
the COEF statement. If the ID statement is omitted, the LP procedure looks for the default variable
names having the prefix _ID.
NOTE: The ID statement is an alias for the ROW statement.
IPIVOT Statement
IPIVOT ;
The IPIVOT statement causes the LP procedure to execute one integer branch-and-bound pivot and
pause. If you use the IPIVOT statement while the PROXIMITYPAUSE= option is in effect, pause
interferences may occur. To avoid such interferences, you must either reset the PROXIMITYPAUSE
value or submit IPIVOT; RUN; instead of IPIVOT;.
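For example, at a pause the following takes a single branch-and-bound pivot and then resumes optimization:

```sas
ipivot;   /* one integer branch-and-bound pivot, then pause */
run;      /* resume optimization */
```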
PIVOT Statement
PIVOT ;
The PIVOT statement causes the LP procedure to execute one simplex pivot and pause.
PRINT Statement
PRINT options ;
The PRINT statement is useful for displaying part of a solution summary, examining intermediate
tableaus, performing sensitivity analysis, and using parametric programming. In the options, the
colnames and rownames lists can be empty, in which case the LP procedure displays tables with all
columns or rows, or both. If a column name or a row name has spaces or other special characters in
it, the name must be enclosed in single or double quotes when it appears in the argument.
The options that can be used with this statement are as follows.
BEST
displays a Solution, Variable, and Constraint Summary for the best integer solution found.
COLUMN(colnames) / SENSITIVITY
displays a Variable Summary containing the logical and structural variables listed in the
colnames list. If the / SENSITIVITY option is included, then sensitivity analysis is performed
on the price coefficients for the listed colnames structural variables.
INTEGER
displays a Variable Summary containing only the integer variables.
INTEGER_NONZEROS
displays a Variable Summary containing only the integer variables with nonzero activity.
INTEGER_ZEROS
displays a Variable Summary containing only the integer variables with zero activity.
MATRIX(rownames,colnames) / PICTURE
displays the submatrix of the matrix of constraint coefficients defined by the rownames and
colnames lists. If the / PICTURE option is included, then the formatted submatrix is displayed.
The format used is summarized in Table 5.2.
Table 5.2 Format Summary
Condition on the Coefficient x     Symbols Printed
abs(x) = 0                         (blank)
0 < abs(x) < .000001               sgn(x) Z
.000001 ≤ abs(x) < .00001          sgn(x) Y
.00001 ≤ abs(x) < .0001            sgn(x) X
.0001 ≤ abs(x) < .001              sgn(x) W
.001 ≤ abs(x) < .01                sgn(x) V
.01 ≤ abs(x) < .1                  sgn(x) U
.1 ≤ abs(x) < 1                    sgn(x) T
abs(x) = 1                         sgn(x) 1
1 < abs(x) < 10                    sgn(x) A
10 ≤ abs(x) < 100                  sgn(x) B
100 ≤ abs(x) < 1000                sgn(x) C
1000 ≤ abs(x) < 10000              sgn(x) D
10000 ≤ abs(x) < 100000            sgn(x) E
100000 ≤ abs(x) < 1.0E06           sgn(x) F
NONINTEGER
displays a Variable Summary containing only the continuous variables.
NONINTEGER_NONZEROS
displays a Variable Summary containing only the continuous variables with nonzero activity.
NONZEROS
displays a Variable Summary containing only the variables with nonzero activity.
PRICESEN
displays the results of parametric programming for the current value of the PRICEPHI= option,
the price coefficients, and all of the price change vectors.
RANGEPRICE
performs range analysis on the price coefficients.
RANGERHS
performs range analysis on the right-hand-side vector.
RHSSEN
displays the results of parametric programming for the current value of the RHSPHI= option,
the right-hand-side coefficients, and all of the right-hand-side change vectors.
ROW(rownames) / SENSITIVITY
displays a constraint summary containing the rows listed in the rownames list. If the
/ SENSITIVITY option is included, then sensitivity analysis is performed on the right-hand-side
coefficients for the listed rownames.
SOLUTION
displays the Solution Summary, including the Variable Summary and the Constraint Summary.
TABLEAU
displays the current tableau.
ZEROS
displays a Variable Summary containing only the variables with zero activity. This may be
useful in the analysis of ON/OFF, ZERO/ONE, scheduling, and assignment applications.
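For example, several of the preceding options can be combined in one interactive session. The following sketch is illustrative only: the data set MODEL and the row and column names are assumptions, not part of this chapter's examples.

```sas
/* Hypothetical interactive session; WORK.MODEL and the row and
   column names are assumed for illustration only.              */
proc lp data=model;
run;
   print zeros;                            /* variables at zero activity   */
   print row(labor budget) / sensitivity;  /* RHS sensitivity for two rows */
   print matrix(labor budget, desks chairs) / picture;  /* formatted submatrix */
quit;
```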
QUIT Statement
QUIT options ;
The QUIT statement causes the LP procedure to terminate processing immediately. No further
displaying is performed and no output data sets are created.
The QUIT/SAVE statement causes the LP procedure to save the output data sets, defined in the
PROC LP statement or in the RESET statement, and then terminate the procedure.
RANGE Statement
RANGE variable ;
For the dense input format, the RANGE statement identifies the variable in the problem data set that
contains the range coefficients. These coefficients enable you to specify the feasible range of a row.
For example, if the ith row is

   aᵀx ≤ b_i

and the range coefficient for this row is r_i > 0, then all values of x that satisfy

   b_i − r_i ≤ aᵀx ≤ b_i

are feasible for this row. Table 5.3 shows the bounds on a row as a function of the row type and the
sign on a nonmissing range coefficient r.
Table 5.3 Interpretation of the Range Coefficient

                        Bounds
  r      _TYPE_     Lower       Upper
 ≠ 0     LE         b − |r|     b
 ≠ 0     GE         b           b + |r|
 > 0     EQ         b           b + r
 < 0     EQ         b + r       b
If you include a range variable in the model and have a missing value or zero for it in a constraint
row, then that constraint is treated as if no range variable had been included.
If the RANGE statement is omitted, the LP procedure assumes that the variable named _RANGE_
contains the range coefficients.
For the sparse input format, the RANGE statement gives the name of a column in the problem data
set that contains the range constants. If the RANGE statement is omitted, then the LP procedure
assumes that the column named _RANGE_ or the column with the RANGE keyword in the problem
data set contains the range constants.
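As a sketch of the dense form (the model and all names are hypothetical), a LE row with right-hand side 800 and range coefficient 100 restricts the row activity to the interval [700, 800]. With no RANGE statement, PROC LP picks up the default variable _RANGE_:

```sas
/* Hypothetical dense-format model with a range coefficient. */
data model;
   input _row_ $ x1 x2 _type_ $ _rhs_ _range_;
   datalines;
profit  5  4  max    .    .
labor   2  3  le   800  100
;
run;

proc lp data=model;
run;
```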
RESET Statement
RESET options ;
The RESET statement is used to change options after the LP procedure has started execution.
All of the options that can be set in the PROC LP statement can also be reset with the RESET
statement, except for the DATA=, the PRIMALIN=, and the ACTIVEIN= options. In addition to the
options available with the PROC LP statement, the following two options can be used.
LOWER(colnames)=n;
During phase 3, this sets the lower bound on all of the structural variables listed in the colnames
list to an integer value n. This may contaminate the branch-and-bound tree. All nodes that
descend from the current problem have lower bounds that may be different from those input in
the problem data set.
UPPER(colnames)=n;
During phase 3, this sets the upper bound on all of the structural variables listed in the colnames
list to an integer value n. This may contaminate the branch-and-bound tree. All nodes that
descend from the current problem have upper bounds that may be different from those input in
the problem data set.
Note that the LOWER= and UPPER= options only apply to phase 3 for integer problems.
Therefore, they should only be applied once the integer iterations have started; if they are
applied before then, they will be ignored.
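For example, the following hypothetical interactive session tightens bounds after the integer iterations have begun; the data set and column names are illustrative, and the mechanism by which the procedure pauses (a breakpoint option) is assumed:

```sas
proc lp data=model;      /* assume an integer problem that pauses     */
run;                     /* during the branch-and-bound iterations    */
   reset lower(x1)=2;    /* applies during phase 3 only               */
   reset upper(x1 x2)=10;
run;
quit;
```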
RHS Statement
RHS variables ;
For the dense input format, the RHS statement identifies variables in the problem data set that contain
the right-hand-side constants of the linear program. Only numeric variables can be specied. If more
than one variable is included in the RHS statement, the LP procedure assumes that problems for
several linear programs are defined in the problem data set. A new linear program is defined for
each variable in the RHS list. If the RHS statement is omitted, the procedure assumes that a variable
named _RHS_ contains the right-hand-side constants.
For the sparse input format, the RHS statement gives the names of one or more columns in the
problem data set that are to be considered as right-hand-side constants. If the RHS statement is
omitted, then the LP procedure assumes that the column named _RHS_ or columns with the RHS
keyword in the problem data set contain the right-hand-side constants. See the section Sparse Data
Input Format on page 208 for further discussion.
By default, the LP procedure assumes that the RHS constant is a zero vector for the dense and sparse
input formats.
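A sketch of the multiple-RHS case (all names hypothetical): each RHS variable defines its own linear program over the same constraint matrix.

```sas
/* Two demand scenarios sharing one matrix; PROC LP solves one LP
   per variable listed in the RHS statement.                      */
data model;
   input _row_ $ x1 x2 _type_ $ rhs_low rhs_high;
   datalines;
profit   5  4  max    .    .
machine  2  3  le   100  150
labor    1  2  le    80  120
;
run;

proc lp data=model;
   rhs rhs_low rhs_high;
run;
```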
RHSSEN Statement
RHSSEN variables ;
For the dense input format, the RHSSEN statement identifies variables in the problem data set that
define change vectors for examining the sensitivity of the optimal solution to changes in the RHS
constants. If the RHSSEN statement is omitted, then the LP procedure assumes that a variable named
_RHSSEN_ contains a right-hand-side change vector.
For the sparse input format, the RHSSEN statement gives the names of one or more columns in the
problem data set that are to be considered as change vectors. If the RHSSEN statement is omitted,
then the LP procedure assumes that the column named _RHSSEN_ or columns with the RHSSEN
keyword in the problem data set contain the right-hand-side change vectors. For further information,
see the section Sparse Data Input Format on page 208, the section Right-Hand-Side Sensitivity
Analysis on page 226, and the section Right-Hand-Side Parametric Programming on page 228.
ROW Statement
ROW variable(s) ;
For the dense input format, the ROW statement specifies a character variable in the problem data
set that contains a name for each row of constraint coefficients, each row of objective coefficients,
and each variable describing row. If the ROW statement is omitted, the LP procedure looks for the
default variable name, _ROW_. If there is no such variable in the problem data set, the procedure
default variable name, _ROW_. If there is no such variable in the problem data set, the procedure
assigns the default name _OBSxx_ to each row, where xx specifies the observation number in the
problem data set.
For the sparse input format, the ROW statement specifies the character variables in the problem data
set that contain the names of the rows in the model. Rows in the model are one of the following types:
constraints, objective functions, bounding rows, or variable describing rows. The ROW variables
must be character variables. There must be the same number of ROW variables as variables specified
in the COEF statement. If the ROW statement is omitted, the LP procedure looks for the default
variable names having the prefix _ROW.
RUN Statement
RUN ;
The RUN statement causes optimization to be started or resumed.
The TITLE or OPTIONS statement should not appear between PROC LP and RUN statements.
SHOW Statement
SHOW options ;
The SHOW statement species that the LP procedure display either the current options or the current
solution status on the SAS log.
OPTIONS
requests that the current options be displayed on the SAS log.
STATUS
requests that the status of the current solution be displayed on the SAS log.
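In an interactive session, SHOW can be issued between RUN statements; for example (the data set name MODEL is illustrative):

```sas
proc lp data=model;
run;
   show options;   /* writes the current option settings to the log  */
   show status;    /* writes the current solution status to the log  */
quit;
```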
TYPE Statement
TYPE variable ;
The TYPE statement specifies a character variable in the problem data set that contains the type
identier for each observation. This variable has keyword values that specify how the LP procedure
should interpret the observation. If the TYPE statement is omitted, the procedure assumes that a
variable named _TYPE_ contains the type keywords.
For the dense input format, the type variable identies the constraint and objective rows and rows
that contain information about the variables. The type variable should have nonmissing values in all
observations.
For the sparse input format, the type variable identifies a model's rows and columns. In an observation,
a nonmissing type is associated with either a row or a column. If there are many columns sharing the
same type, you can define a row of that type. Then, any nonmissing values in that row set the types
of the corresponding columns.
The following are valid values for the TYPE variable in an observation:
MIN        contains the price coefficients of an objective row, for example, c
           in the problem (MIP), to be minimized.
MAX        contains the price coefficients of an objective row, for example, c,
           to be maximized.
EQ (=)     contains coefficients of an equality constrained row.
LE (≤)     contains coefficients of an inequality, less than or equal to,
           constrained row.
GE (≥)     contains coefficients of an inequality, greater than or equal to,
           constrained row.
SOSEQ      identifies the row as specifying a special ordered set. The variables
           flagged in this row are members of a set exactly one of which
           must be above its lower bound in the optimal solution. Note that
           variables in this type of special ordered set must be integer.
SOSLE      identifies the row as specifying a special ordered set. The variables
           flagged in this row are members of a set in which only one can be
           above its lower bound in the optimal solution.
UNRSTRT
UNRSTRCT
           identifies those structural variables to be considered as unrestricted
           variables. These are variables for which ℓ_i = −∞ and u_i = +∞.
           Any variable that has a 1 in this observation is considered an
           unrestricted variable.
LOWERBD    identifies lower bounds ℓ_i on the structural variables. If all structural
           variables are to be nonnegative, that is, ℓ_i = 0, then you do not
           need to include an observation with the LOWERBD keyword
           in a variable specified in the TYPE statement. Missing values
           for variables in a lower-bound row indicate that the variable has
           lower bound equal to zero.
           NOTE: A variable with lower or upper bounds cannot be identified
           as unrestricted.
UPPERBD    identifies upper bounds u_i on the structural variables. For each
           structural variable that is to have an upper bound u_i = +∞, the
           observation must contain a missing value or the current value
           of INFINITY. All other values are interpreted as upper bounds,
           including 0.
FIXED      identifies variables that have fixed values. A nonmissing value in
           a row with FIXED type keyword gives the constant value of that
           variable.
INTEGER identifies variables that are integer-constrained. In a feasible
solution, these variables must have integer values. A missing
value in a row with INTEGER type keyword indicates that the
variable is not integer-constrained. The value of variables in
the INTEGER row gives an ordering to the integer-constrained
variables that is used when the VARSELECT= option equals
PRIOR.
NOTE: Every integer-constrained variable must have an upper
bound defined in a row with type UPPERBD. See the section
Controlling the Branch-and-Bound Search on page 220 for
further discussion.
BINARY identifies variables that are constrained to be either 0 or 1. This is
equivalent to specifying that the variable is an integer variable and
has a lower bound of 0 and an upper bound of 1. A missing value
in a row with BINARY type keyword indicates that the variable
is not constrained to be 0 or 1. The value of variables in the
BINARY row gives an ordering to the integer-constrained
variables that is used when the VARSELECT= option equals PRIOR.
See the section Controlling the Branch-and-Bound Search on
page 220 for further discussion.
BASIC identifies variables that form an initial basic feasible solution.
A missing value in a row with BASIC type indicates that the
variable is not basic.
PRICESEN identifies a vector that is used to evaluate the sensitivity of the
optimal solution to changes in the objective function. See the section
Price Sensitivity Analysis on page 227 and the section Price
Parametric Programming on page 229 for further discussion.
FREE identifies a nonbinding constraint. Any number of FREE
constraints can appear in a problem data set.
RHS identifies a right-hand-side column in the sparse input format.
This replaces the RHS statement. It is useful when converting the
MPS format into the sparse format of PROC LP. See the section
Converting Standard MPS Format to Sparse Format on page 210
for more information.
RHSSEN identifies a right-hand-side sensitivity analysis vector in the sparse
input format. This replaces the RHSSEN statement. It is useful
when converting the MPS format into the sparse format of PROC
LP. See the section Converting Standard MPS Format to Sparse
Format on page 210 for more information.
RANGE identifies a range vector in the sparse input format. This replaces
the RANGE statement. It is useful when converting the MPS
format into the sparse format of PROC LP. See the section
Converting Standard MPS Format to Sparse Format on page 210 for
more information.
VAR Statement
VAR variables ;
For the dense input format, the VAR statement identifies variables in the problem data set that are to
be interpreted as structural variables in the linear program. Only numeric variables can be specified.
If no VAR statement is specified, the LP procedure uses all numeric variables not included in an
RHS or RHSSEN statement as structural variables.
Details: LP Procedure
Missing Values
The LP procedure treats missing values as missing in all rows except those that identify either upper
or lower bounds on structural variables. If the row is an upper-bound row, then the type identifier is
UPPERBD and the LP procedure treats missing values as +∞. If the row is a lower-bound row,
then the type identifier is LOWERBD and the LP procedure treats missing values as 0, except for
the variables that are identified as UNRSTRT.
Dense Data Input Format
In the dense format, a model is expressed in a similar way as it is formulated. Each SAS variable
corresponds to a model's column, and each SAS observation corresponds to a model's row. A SAS
variable in the input data set is one of the following:
- a type variable
- an id variable
- a structural variable
- a right-hand-side variable
- a right-hand-side sensitivity analysis variable
- a range variable
The type variable tells PROC LP how to interpret the observation as a part of the mathematical
programming problem. It identifies and classifies objectives, constraints, and the rows that contain
information about variables, such as types, bounds, and so on. PROC LP recognizes the following keywords
as values for the type variable: MIN, MAX, EQ, LE, GE, SOSEQ, SOSLE, UNRSTRT, LOWERBD,
UPPERBD, FIXED, INTEGER, BINARY, BASIC, PRICESEN, and FREE. The values of the id
variable are the names of the rows in the model. The other variables identify and classify the columns
with numerical values.
The TYPE, ID (or ROW), and RHS statements can be omitted if the input data set contains variables
_TYPE_, _ID_ (or _ROW_), and _RHS_; otherwise, they must be used. The VAR statement is optional.
When it is not specied, PROC LP uses as structural variables all numeric variables not explicitly or
implicitly included in statement lists. The RHSSEN and RANGE statements are optional statements
for sensitivity and range analyses. They can be omitted if the input data set contains the _RHSSEN_
and _RANGE_ variables.
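The following minimal dense-format sketch ties these pieces together; the model and all names are hypothetical. Because the data set contains the default variables _ID_, _TYPE_, and _RHS_, no ID, TYPE, or RHS statement is needed:

```sas
/* A small dense model: columns x1, x2; an objective row, one
   constraint row, and an upper-bound row.                     */
data dense;
   input _id_ $ x1 x2 _type_ $ _rhs_;
   datalines;
object    2  3  max       .
resource  1  2  le       10
upper     6  4  upperbd   .
;
run;

proc lp data=dense;
run;
```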
Sparse Data Input Format
The sparse format to PROC LP is designed to enable you to specify only the nonzero coefficients in
the description of linear programs, integer programs, and mixed-integer programs. The SAS data set
that describes the sparse model must contain at least four SAS variables:
- a type variable
- a column variable
- a row variable
- a coefficient variable
Each observation in the data set associates a type with a row or column, and defines a coefficient or
numerical value in the model. The value of the type variable is a keyword that tells PROC LP how to
interpret the observation. In addition to the keywords in the dense format, PROC LP also recognizes
the keywords RHS, RHSSEN, and RANGE as values of the type variable. Table 5.5 shows the
keywords that are recognized by PROC LP and the variables in the problem data set in which they can appear.
The values of the row and column variables are the names of the rows and columns in the model.
The values of the coefficient variables define basic coefficients and lower and upper bounds, and
identify model variables with types BASIC, FIXED, BINARY, and INTEGER. All character values
in the sparse data input format are case insensitive.
The SAS data set can contain multiple pairs of rows and coefcient variables. In this way, more
information about the model can be specied in each observation in the data set. See Example 5.2
for details.
Table 5.5 Variable Keywords Used in the Problem Data Set

TYPE (_TYPE_)     COL (_COL_)
MIN
MAX
EQ
LE
GE
SOSEQ
SOSLE
UNRSTRT
LOWERBD
UPPERBD
FIXED
INTEGER
BINARY
BASIC
PRICESEN
FREE
RHS               _RHS_
RHSSEN            _RHSSEN_
RANGE             _RANGE_
*xxxxxxx
Follow these rules for sparse data input:
- The order of the observations is unimportant.
- Each unique column name appearing in the COL variable defines a unique column in the
model.

- Each unique row name appearing in the ROW variable defines a unique row in the model.
- The type of the row is identified when an observation in which the row name appears (in a
ROW variable) has type MIN, MAX, LE, GE, EQ, SOSLE, SOSEQ, LOWERBD, UPPERBD,
UNRSTRT, FIXED, BINARY, INTEGER, BASIC, FREE, or PRICESEN.
- The type of each row must be identified at least once. If a row is given a type more than once,
the multiple definitions must be identical.
- When there are multiple rows named in an observation (that is, when there are multiple ROW
variables), the TYPE variable applies to each row named in the observation.
- The type of a column is identified when an observation in which the column name but no row
name appears has the type LOWERBD, UPPERBD, UNRSTRT, FIXED, BINARY, INTEGER,
BASIC, RHS, RHSSEN, or RANGE. A column type can also be identified in an observation
in which both column and row names appear and the row name has one of the preceding types.
- Each column is assumed to be a structural column in the model unless the column is identified
as a right-hand-side vector, a right-hand-side change vector, or a range vector. A column can
be identified as one of these types using either the keywords RHS, RHSSEN, or RANGE in
the TYPE variable, the special column names _RHS_, _RHSSEN_, or _RANGE_, or the RHS,
RHSSEN, or RANGE statements following the PROC LP statement.
- A TYPE variable beginning with the character * causes the observation to be interpreted as a
comment.
When the column names appear in the Variable Summary in the PROC LP output, they are listed in
alphabetical order. The row names appear in the order in which they appear in the problem data set.
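A minimal sparse-format sketch (all names hypothetical) illustrates these rules: only nonzero coefficients appear, each row's type is declared once in an observation of its own, and the right-hand side is carried in the special column _RHS_:

```sas
/* Sparse form of a two-variable model. Observations with a missing
   _type_ supply coefficients for rows whose types were set earlier. */
data sparse;
   input _type_ $ _row_ $ _col_ $ _coef_;
   datalines;
max  object    .        .
.    object    x1       2
.    object    x2       3
le   resource  .        .
.    resource  x1       1
.    resource  x2       2
.    resource  _RHS_   10
;
run;

proc lp data=sparse sparsedata;
run;
```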
Converting Any PROC LP Format to an MPS-Format SAS Data Set
The MPSOUT= option enables you to convert an input data set for the LP procedure into an MPS-
format SAS data set. The converted data set is readable by the OPTLP and OPTMILP procedures.
The conversion can handle both linear and mixed integer linear programs. The _TYPE_ values for
sensitivity analysis (PRICESEN), parametric programming (RHSSEN), and input basis (BASIS) are
dropped. When multiple objective rows are present, only the first row is marked as the objective
row. The remaining rows are marked as free rows. When multiple right-hand-side (RHS) columns
are present, only the first RHS column is processed. Constraints with a _TYPE_ value of SOSEQ or
SOSLE are ignored. The MPSOUT= option does not output branching priorities specified for the
VARSELECT=PRIOR option to a BRANCH section in the MPS-format SAS data set.
For information about how the contents of the MPS-format SAS data set are interpreted, see
Chapter 16, The MPS-Format SAS Data Set.
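A sketch of the conversion (data set names are illustrative):

```sas
/* Solve with PROC LP and, via MPSOUT=, also write the model as an
   MPS-format SAS data set that PROC OPTLP can read.               */
proc lp data=model mpsout=mpsmodel;
run;

proc optlp data=mpsmodel;
run;
```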
Converting Standard MPS Format to Sparse Format
The MPS input format was introduced by IBM as a way of specifying data for linear and integer
programs. Before you can solve a linear program specied in the MPS input format by using the LP
procedure, the data must be converted to the sparse format of the LP procedure. If you want to solve
a linear program specied in the sparse LP format by using the OPTLP procedure, you must convert
the data into an MPS-format SAS data set. This section describes how to perform both conversions.
SASMPSXS is a SAS macro function that converts the standard MPS format to the sparse format of
the LP procedure. The following is an example of the MPS format:
NAME EXAMPLE
*
THIS IS DATA FOR THE PRODUCT MIX PROBLEM.
ROWS
N PROFIT
L STAMP
L ASSEMB
L FINISH
N CHNROW
N PRICE
COLUMNS
DESK STAMP 3.00000 ASSEMB 10.00000
DESK FINISH 10.00000 PROFIT 95.00000
DESK PRICE 175.00000
CHAIR STAMP 1.50000 ASSEMB 6.00000
CHAIR FINISH 8.00000 PROFIT 41.00000
CHAIR PRICE 95.00000
CABINET STAMP 2.00000 ASSEMB 8.00000
CABINET FINISH 8.00000 PROFIT 84.00000
CABINET PRICE 145.00000
BOOKCSE STAMP 2.00000 ASSEMB 7.00000
BOOKCSE FINISH 7.00000 PROFIT 76.00000
BOOKCSE PRICE 130.00000 CHNROW 1.00000
RHS
TIME STAMP 800.00000 ASSEMB 1200.0000
TIME FINISH 800.00000
RANGES
T1 ASSEMB 900.00000
BOUNDS
UP CHAIR 75.00000
LO BOOKCSE 50.00000
ENDATA
In this example, the company tries to find an optimal product mix of four items: a DESK, a CHAIR,
a CABINET, and a BOOKCASE. Each item is processed in a stamping department (STAMP), an
assembly department (ASSEMB), and a finishing department (FINISH). The time each item requires
in each department is given in the input data. Because of resource limitations, each department has
an upper limit on the time available for processing. Furthermore, because of labor constraints, the
assembly department must work at least 300 hours. Finally, marketing tells you not to make more
than 75 chairs, to make at least 50 bookcases, and to find the range over which the selling price of a
bookcase can vary without changing the optimal product mix.
The SASMPSXS macro function uses MPSFILE=FILENAME as an argument to read an MPS
input file. It then converts the file and saves the conversion to a default SAS data set, PROB. The
FILENAME should include the path.
Running the following statements on the preceding example
%sasmpsxs(mpsfile='filename');
proc print data=prob;
run;
produces the sparse input form of the LP procedure:
OBS  _TYPE_    _COL_     _ROW1_    _COEF1_   _ROW2_    _COEF2_

  1  *ROWS                              .                    .
  2  FREE                PROFIT         .                    .
  3  LE                  STAMP          .                    .
  4  LE                  ASSEMB         .                    .
  5  LE                  FINISH         .                    .
  6  FREE                CHNROW         .                    .
  7  FREE                PRICE          .                    .
  8  *COLUMNS                           .                    .
  9            DESK      STAMP        3.0    ASSEMB         10
 10            DESK      FINISH      10.0    PROFIT         95
 11            DESK      PRICE      175.0                    .
 12            CHAIR     STAMP        1.5    ASSEMB          6
 13            CHAIR     FINISH       8.0    PROFIT         41
 14            CHAIR     PRICE       95.0                    .
 15            CABINET   STAMP        2.0    ASSEMB          8
 16            CABINET   FINISH       8.0    PROFIT         84
 17            CABINET   PRICE      145.0                    .
 18            BOOKCSE   STAMP          2    ASSEMB          7
 19            BOOKCSE   FINISH         7    PROFIT         76
 20            BOOKCSE   PRICE        130    CHNROW          1
 21  *RHS                               .                    .
 22  RHS       TIME      STAMP        800    ASSEMB       1200
 23  RHS       TIME      FINISH       800                    .
 24  *RANGES                            .                    .
 25  RANGE     T1        ASSEMB       900                    .
 26  *BOUNDS                            .                    .
 27  UPPERBDD  CHAIR     UP            75                    .
 28  LOWERBDD  BOOKCSE   LO            50                    .
SASMPSXS recognizes four MPS row types: E, L, G, and N. It converts them into types EQ, LE,
GE, and FREE. Since objective rows, price change rows and free rows all share the same type N in
the MPS format, you need a DATA step to assign proper types to the objective rows and price change
rows.
data;
set prob;
if _type_='FREE' and _row1_='PROFIT' then _type_='MAX';
if _type_='FREE' and _row1_='CHNROW' then _type_='PRICESEN';
run;
proc lp sparsedata;
run;
In the MPS format, the variable types include LO, UP, FX, FR, MI, and BV. The SASMPSXS macro
converts them into types LOWERBD, UPPERBD, FIXED, UNRESTRICTED, -INFINITY, and
BINARY, respectively. Occasionally, you may need to define your own variable types, in which case,
you must add corresponding type handling entries in the SASMPSXS.SAS program and use the SAS
%INCLUDE macro to include the file at the beginning of your program. The SASMPSXS macro
function can be found in the SAS sample library. Information on the MPS format can be obtained
from Murtagh (1981).
SASMPSXS can take no arguments, or it can take one or two arguments. If no arguments are
present, SASMPSXS assumes that the MPS input le has been saved to a SAS data set named RAW.
The macro then takes information from that data set and converts it into the sparse form of the LP
procedure. The RAW data set should have the following six variables:
data RAW;
infile ...;
input field1 $ 2-3 field2 $ 5-12
field3 $ 15-22 field4 25-36
field5 $ 40-47 field6 50-61;
...
run;
If the preceding MPS input data set has a name other than RAW, you can use MPSDATA=SAS-data-
set as an argument in the SASMPSXS macro function. If you want the converted sparse form data
set to have a name other than PROB, you can use LPDATA=SAS-data-set as an argument. The order
of the arguments in the SASMPSXS macro function is not important.
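For example (data set names are illustrative):

```sas
/* Convert the MPS image stored in WORK.MYMPS into a sparse-format
   data set named WORK.MYLP instead of the defaults RAW and PROB.  */
%sasmpsxs(mpsdata=mymps, lpdata=mylp);

proc lp data=mylp sparsedata;
run;
```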
The Reduced Costs, Dual Activities, and Current Tableau
The evaluation of reduced costs and the dual activities is independent of problem structure. For a
basic solution, let B be the matrix composed of the basic columns of A and let N be the matrix
composed of the nonbasic columns of A. The reduced cost associated with the ith variable is

   (cᵀ − c_Bᵀ B⁻¹A)_i

and the dual activity of the jth row is

   (c_Bᵀ B⁻¹)_j

The Current Tableau is a section displayed when you specify either the TABLEAUPRINT option in
the PROC LP statement or the TABLEAU option in the PRINT statement. The output contains a row
for each basic variable and a column for each nonbasic variable. In addition, there is a row for the
reduced costs and a column for the product

   B⁻¹b

This column is labeled INV(B)*R. The body of the tableau contains the matrix

   B⁻¹N
Macro Variable _ORLP_
The LP procedure defines a macro variable named _ORLP_. This variable contains a character string
that indicates the status of the procedure. It is set whenever the user gets control, at breakpoints, and
at procedure termination. The form of the _ORLP_ character string is STATUS= PHASE=
OBJECTIVE= P_FEAS= D_FEAS= INT_ITER= INT_FEAS= ACTIVE= INT_BEST= PHASE1_ITER=
PHASE2_ITER= PHASE3_ITER=. The terms are interpreted as follows:
STATUS= the status of the current solution
PHASE= the phase the procedure is in (1, 2, or 3)
OBJECTIVE= the current objective value
P_FEAS= whether the current solution is primal feasible
D_FEAS= whether the current solution is dual feasible
INT_ITER= the number of integer iterations performed
INT_FEAS= the number of integer feasible solutions found
ACTIVE= the number of active nodes in the current branch-and-bound
tree
INT_BEST= the best integer objective value found
PHASE1_ITER= the number of iterations performed in phase 1
PHASE2_ITER= the number of iterations performed in phase 2
PHASE3_ITER= the number of iterations performed in phase 3
Table 5.7 shows the possible values for the nonnumeric terms in the string.
Table 5.7 Possible Values for Nonnumeric Terms
STATUS= P_FEAS= D_FEAS=
SUCCESSFUL YES YES
UNBOUNDED NO NO
INFEASIBLE
MAX_TIME
MAX_ITER
PIVOT
BREAK
INT_FEASIBLE
INT_INFEASIBLE
INT_MAX_ITER
PAUSE
FEASIBLEPAUSE
IPAUSE
PROXIMITYPAUSE
ACTIVE
RELAXED
FATHOMED
IPIVOT
UNSTABLE
SINGULAR
MEMORY_ERROR
IO_ERROR
SYNTAX_ERROR
SEMANTIC_ERROR
BADDATA_ERROR
UNKNOWN_ERROR
This information can be used when PROC LP is one step in a larger program that needs to identify
how the LP procedure terminated. Because _ORLP_ is a standard SAS macro variable, it can be
used in the ways that all macro variables can be used (see the SAS Guide to Macro Processing).
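As a sketch, a calling program might test the STATUS= term of _ORLP_ before continuing; the data set name MODEL is illustrative:

```sas
proc lp data=model;
run;

/* _ORLP_ is an ordinary macro variable, so %INDEX can search it. */
%macro checklp;
   %if %index(&_orlp_, STATUS=SUCCESSFUL) %then
      %put NOTE: PROC LP found an optimal solution.;
   %else
      %put WARNING: PROC LP status string: &_orlp_;
%mend checklp;
%checklp
```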
Pricing
PROC LP performs multiple pricing when determining which variable will enter the basis at each
pivot (Greenberg 1978). This heuristic can shorten execution time in many problems. The specifics
of the multiple pricing algorithm depend on the value of the PRICETYPE= option. However, in
general, when some form of multiple pricing is used, during the first iteration PROC LP places the
PRICE= nonbasic columns yielding the greatest marginal improvement to the objective function in a
candidate list. This list identies a subproblem of the original. On subsequent iterations, only the
reduced costs for the nonbasic variables in the candidate list are calculated. This accounts for the
potential time savings. When either the candidate list is empty or the subproblem is optimal, a new
candidate list must be identified and the process repeats. Because identification of the subproblem
requires pricing the complete problem, an iteration in which this occurs is called a major iteration. A
minor iteration is an iteration in which only the subproblem is to be priced.
The value of the PRICETYPE= option determines the type of multiple pricing that is to be used.
The types of multiple pricing include partial suboptimization (PRICETYPE=PARTIAL), complete
suboptimization (PRICETYPE=COMPLETE), and complete suboptimization with dynamically
varying the value of the PRICE= option (PRICETYPE=DYNAMIC).
When partial suboptimization is used, in each minor iteration the nonbasic column in the subproblem
yielding the greatest marginal improvement to the objective is brought into the basis and removed
from the candidate list. The candidate list now has one less entry. At each subsequent iteration,
another column from the subproblem is brought into the basis and removed from the candidate
list. When there are either no remaining candidates or the remaining candidates do not improve the
objective, the subproblem is abandoned and a major iteration is performed. If the objective cannot be
improved on a major iteration, the current solution is optimal and PROC LP terminates.
Complete suboptimization is identical to partial suboptimization with one exception. When a
nonbasic column from the subproblem is brought into the basis, it is replaced in the candidate list by
the basic column that is leaving the basis. As a result, the candidate list does not diminish at each
iteration.
When PRICETYPE=DYNAMIC, complete suboptimization is performed, but the value of the
PRICE= option changes so that the ratio of minor to major iterations is within two units of the
PRICE= option.
These heuristics can shorten execution time for small values of the PRICE= option. Care should be
exercised in choosing a value for the PRICE= option because too large a value can use more time
than if pricing were not used.
Scaling
Based on the SCALE= option specified, the procedure scales the coefficients of both the constraints
and objective rows before iterating. This technique can improve the numerical stability of an
ill-conditioned problem. If you want to modify the default matrix scaling used, which is SCALE=BOTH,
use the SCALE=COLUMN, SCALE=ROW, or SCALE=NONE option in the PROC LP statement. If
SCALE=BOTH, the matrix coefficients are scaled so that the largest element in absolute value in each
row or column equals 1. They are scaled by columns first and then by rows. If SCALE=COLUMN
(ROW), the matrix coefficients are scaled so that the largest element in absolute value in each column
(row) equals 1. If SCALE=NONE, no scaling is performed.
Preprocessing
With the preprocessing option, you can identify redundant and infeasible constraints, improve lower
and upper bounds of variables, fix variable values, and improve coefficients and RHS values before
solving a problem. Preprocessing can be applied to LP, IP, and MIP problems. For an LP problem, it
may significantly reduce the problem size. For an IP or MIP problem, it can often reduce the gap
between the optimal solution and the solution of the relaxed problem, which could lead to a smaller
search tree in the branch-and-bound algorithm. As a result, the CPU time may be reduced on many
problems. Although there is no guarantee that preprocessing will always yield a faster solution, it
does provide a highly effective approach to solving large and difficult problems.
Preprocessing is especially useful when the original problem causes numerical difficulties to PROC
LP. Since preprocessing could identify redundant constraints and tighten lower and upper bounds of
variables, the reformulated problem may eliminate the numerical difficulties in practice.
When a constraint is identified as redundant, its type is marked as FREE in the Constraint Summary.
If a variable is fixed, its type is marked as FIXED in the Variables Summary. If a constraint is
identified as infeasible, PROC LP stops immediately and displays the constraint name in the SAS
log. This capability sometimes gives valuable insight into the model or the formulation and helps
establish whether the model is reasonable and the formulation is correct.
For a large and dense problem, preprocessing may take a longer time for each iteration. To limit the
number of preprocessing iterations, use the PMAXIT= option. To stop any further preprocessing
during the preprocessing stage, press the CTRL-BREAK key. PROC LP will enter phase 1 at the end
of the current iteration.
Integer Programming
Formulations of mathematical programs often require that some of the decision variables take only
integer values. Consider the formulation

   minimize c^T x
   subject to Ax {≥, =, ≤} b
              ℓ ≤ x ≤ u
              x_i is integer, i ∈ S

The set of indices S identifies those variables that must take only integer values. When S does not
contain all of the integers between 1 and n, inclusive, this problem is called a mixed-integer program
(MIP). Otherwise, it is known as an integer program. Let x^opt(MIP) denote an optimal solution to
(MIP). An integer variable with bounds between 0 and 1 is also called a binary variable.
Specifying the Problem
An integer or mixed-integer problem can be solved with PROC LP. To solve this problem, you must
identify the integer variables. You can do this with a row in the input data set that has the keyword
INTEGER for the type variable. Any variable that has a nonmissing and nonzero value for this
row is interpreted as an integer variable. It is important to note that integer variables must have
upper bounds explicitly defined using the UPPERBD keyword. The values in the INTEGER row
not only identify those variables that must be integers, but they also give an ordering to the integer
variables that can be used in the solution technique.
218 ! Chapter 5: The LP Procedure
You can follow the same steps to identify binary variables. For the binary variables, there is no need
to supply any upper bounds.
Following the rules of sparse data input format, you can also identify individual integer or binary
variables.
The Branch-and-Bound Technique
The branch-and-bound approach is used to solve integer and mixed-integer problems. The following
discussion outlines the approach and explains how to use several options to control the procedure.
The branch-and-bound technique solves an integer program by solving a sequence of linear programs.
The sequence can be represented by a tree, with each node in the tree being identified with a linear
program that is derived from the problems on the path leading to the root of the tree. The root of the
tree is identified with a linear program that is identical to (MIP), except that S is empty. This relaxed
version of (MIP), called (LP(0)), can be written as

   x^opt(0) = min c^T x
   subject to Ax {≥, =, ≤} b
              ℓ ≤ x ≤ u
The branch-and-bound approach generates linear programs along the nodes of the tree using the
following scheme. Consider x^opt(0), the optimal solution to (LP(0)). If x^opt(0)_i is integer for all
i ∈ S, then x^opt(0) is optimal in (MIP). Suppose for some i ∈ S, x^opt(0)_i is nonintegral. In that
case, define two new problems (LP(1)) and (LP(2)), descendants of the parent problem (LP(0)). The
problem (LP(1)) is identical to (LP(0)) except for the additional constraint

   x_i ≤ ⌊x^opt(0)_i⌋

and the problem (LP(2)) is identical to (LP(0)) except for the additional constraint

   x_i ≥ ⌈x^opt(0)_i⌉

The notation ⌈y⌉ means the smallest integer greater than or equal to y, and the notation ⌊y⌋ means
the largest integer less than or equal to y. Note that the two new problems do not have x^opt(0) as a
feasible solution, but because the solution to (MIP) must satisfy one of the preceding constraints,
x^opt(MIP)_i must satisfy one of the new constraints. The two problems thus defined are called active
nodes in the branch-and-bound tree, and the variable x_i is called the branching variable.
Next, the algorithm chooses one of the problems associated with an active node and attempts to
solve it using the dual simplex algorithm. The problem may be infeasible, in which case the problem
is dropped. If it can be solved, and it in turn does not have an integer solution (that is, a solution
for which x_i is integer for all i ∈ S), then it defines two new problems. These new problems each
contain all of the constraints of the parent problems plus the appropriate additional one.
Branching continues in this manner until either there are no active nodes or an integer solution is
found. When an integer solution is found, its objective value provides a bound for the objective of
(MIP). In particular, if z is the objective value of the current best integer solution, then any active
problems whose parent problem has objective value ≥ z can be discarded (assuming that the problem
is a minimization). This can be done because all problems that descend from this parent will also
have objective value ≥ z. This technique is known as fathoming. When there are no active nodes
remaining to be solved, the current integer solution is optimal in (MIP). If no integer solution has
been found, then (MIP) is (integer) infeasible.
It is important to realize that integer programs are NP-complete. Roughly speaking, this means that
the effort required to solve them grows exponentially with the size of the problem. For example, a
problem with 10 binary variables can, in the worst case, generate 2^10 = 1024 nodes in the branch-
and-bound tree. A problem with 20 binary variables can, in the worst case, generate 2^20 = 1,048,576
nodes in the branch-and-bound tree. Although the algorithm is unlikely to have to generate every
single possible node, the need to explore even a small fraction of the potential number of nodes for a
large problem can be resource intensive.
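The branching and fathoming scheme described above can be sketched in a few lines. The following Python sketch is illustrative only: it solves a tiny 0-1 knapsack (a maximization, so a node is fathomed when its relaxation bound is no better than the incumbent), and it uses the closed-form greedy bound of the knapsack LP relaxation in place of the dual simplex algorithm that PROC LP uses. All names and data are invented.

```python
def lp_bound(values, weights, cap, fixed):
    """Dantzig bound: LP relaxation of the knapsack, honoring fixed 0/1 vars."""
    bound, room = 0.0, cap
    for i, v in fixed.items():
        if v == 1:
            bound += values[i]
            room -= weights[i]
    if room < 0:
        return None                     # node is infeasible
    free = [i for i in range(len(values)) if i not in fixed]
    # Take free items greedily by value/weight ratio, one possibly fractional.
    for i in sorted(free, key=lambda i: -values[i] / weights[i]):
        take = min(1.0, room / weights[i])
        bound += take * values[i]
        room -= take * weights[i]
        if room <= 0:
            break
    return bound

def branch_and_bound(values, weights, cap):
    best = 0.0                          # incumbent objective z*
    active = [{}]                       # root node: nothing fixed yet
    while active:
        fixed = active.pop()            # LIFO: depth-first search
        bound = lp_bound(values, weights, cap, fixed)
        if bound is None or bound <= best:
            continue                    # infeasible or fathomed
        if len(fixed) == len(values):   # all variables integer: new incumbent
            best = bound
            continue
        i = len(fixed)                  # branch on the next free variable
        active.append({**fixed, i: 0})
        active.append({**fixed, i: 1})
    return best
```

The `bound <= best` test is the fathoming step: a node whose relaxation cannot beat the incumbent is discarded without branching.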
The Integer Iteration Log
To help monitor the growth of the branch-and-bound tree, the LP procedure reports on the status
of each problem that is solved. The report, displayed in the Integer Iteration Log, can be used to
reconstruct the branch-and-bound tree. Each row in the report describes the results of the attempted
solution of the linear program at a node in the tree. In the following discussion, a problem on a given
line in the log is called the current problem. The following columns are displayed in the report:
Iter identifies the number of the branch-and-bound iteration.
Problem identifies how the current problem fits in the branch-and-bound
tree.
Condition reports the result of the attempted solution of the current
problem. Values for Condition are:
• ACTIVE: The current problem was solved successfully.
• INFEASIBLE: The current problem is infeasible.
• FATHOMED: The current problem cannot lead to an
improved integer solution and therefore it is dropped.
• SINGULAR: A singular basis was encountered in
attempting to solve the current problem. Solution of this
relaxed problem is suspended and will be attempted
later if necessary.
• SUBOPTIMAL: The current problem has an integer
feasible solution.
Objective reports the objective value of the current problem.
Branched names the variable that is branched on in subtrees defined by the
descendants of this problem.
Value gives the current value of the variable named in the column
labeled Branched.
Sinfeas gives the sum of the integer infeasibilities in the optimal
solution to the current problem.
Active reports the total number of nodes currently active in the
branch-and-bound tree.
Proximity reports the gap between the best integer solution and the
current lower (upper for maximizations) bound of all active
nodes.
To reconstruct the branch-and-bound tree from this report, consider the interpretation of iteration j.
If Iter=j and Problem=k, then the problem solved on iteration j is identical to the problem solved
on iteration |k| with an additional constraint. If k > 0, then the constraint is an upper bound on
the variable named in the Branched column on iteration |k|. If k < 0, then the constraint is a lower
bound on that variable. The value of the bound can be obtained from the value of Value in iteration
|k|, as described in the previous section.
Example 5.8 in the section Examples: LP Procedure on page 241 shows an Integer Iteration Log in
its output.
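The reconstruction rule can be sketched as a small function. The log rows and field names below are invented for illustration, mirroring the Iter, Problem, Branched, and Value columns of the report.

```python
import math

def constraints_for(log, it):
    """Follow Problem=k links back to the root, collecting branching bounds."""
    rows = {r["Iter"]: r for r in log}
    out = []
    k = rows[it]["Problem"]
    while k != 0:
        parent = rows[abs(k)]
        var, val = parent["Branched"], parent["Value"]
        if k > 0:       # k > 0: an upper bound was added at iteration |k|
            out.append((var, "<=", math.floor(val)))
        else:           # k < 0: a lower bound was added at iteration |k|
            out.append((var, ">=", math.ceil(val)))
        k = parent["Problem"]
    return out

# Invented log rows: iteration 2 descends from 1 (lower branch on x3),
# and iteration 3 descends from 2 (upper branch on x1).
log = [
    {"Iter": 1, "Problem": 0,  "Branched": "x3", "Value": 2.5},
    {"Iter": 2, "Problem": -1, "Branched": "x1", "Value": 0.3},
    {"Iter": 3, "Problem": 2,  "Branched": "x2", "Value": 1.8},
]
```

For the sample rows, the node of iteration 3 carries the constraints x1 ≤ 0 and x3 ≥ 3 in addition to those of (LP(0)).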
Controlling the Branch-and-Bound Search
There are several options you can use to control branching. This is accomplished by controlling the
program's choice of the branching variable and of the next active node. In the discussion that follows,
let

   f_i(k) = x^opt(k)_i − ⌊x^opt(k)_i⌋

where x^opt(k) is the optimal solution to the problem solved in iteration k.
The CANSELECT= option directs the choice of the next active node. Valid keywords for this option
include LIFO, FIFO, OBJ, PROJECT, PSEUDOC, and ERROR. The following list describes the
action that each of these causes when the procedure must choose for solution a problem from the list
of active nodes.
LIFO chooses the last problem added to the tree of active nodes. This search has the
effect of a depth-first search of the branch-and-bound tree.
FIFO chooses the first node added to the tree of active nodes. This search has the effect
of a breadth-first search of the branch-and-bound tree.
OBJ chooses the problem whose parent has the smallest (largest if the problem is a
maximization) objective value.
PROJECT chooses the problem with the largest (smallest if the problem is a maximization)
projected objective value. The projected objective value is evaluated using the sum
of integer infeasibilities, s(k), associated with an active node (LP(k)), defined by

   s(k) = Σ_{i∈S} min{f_i(k), 1 − f_i(k)}

An empirical measure of the rate of increase (decrease) in the objective value is
defined as

   λ = (z* − z(0)) / s(0)

where
• z(k) is the optimal objective value for (LP(k))
• z* is the objective value of the current best integer solution
The projected objective value for problems (LP(k+1)) and (LP(k+2)) is defined as

   z(k) + λ s(k)
PSEUDOC chooses the problem with the largest (least if the problem is a maximization)
projected pseudocost. The projected pseudocost is evaluated using the weighted
sum of infeasibilities s_w(k) associated with an active problem (LP(k)), defined
by

   s_w(k) = Σ_{i∈S} min{d_i(k) f_i(k), u_i(k) (1 − f_i(k))}

The weights u_i and d_i are initially equal to the absolute value of the ith objective
coefficient and are updated at each integer iteration. They are modified by
examining the empirical marginal change in the objective as additional constraints
are placed on the variables in S along the path from (LP(0)) to a node associated
with an integer feasible solution. In particular, if the definition of problems
(LP(k+1)) and (LP(k+2)) from parent (LP(k)) involves the addition of constraints
x_i ≤ ⌊x^opt(k)_i⌋ and x_i ≥ ⌈x^opt(k)_i⌉, respectively, and one of them is on a path
to an integer feasible solution, then only one of the following is true:

   d_i(k) = (z(k+1) − z(k)) / f_i(k)
   u_i(k) = (z(k+2) − z(k)) / (1 − f_i(k))
Note the similarity between s_w(k) and s(k). The weighted quantity s_w(k)
accounts to some extent for the influence of the objective function. The projected
pseudocost for problems (LP(k+1)) and (LP(k+2)) is defined as

   z_w(k) = z(k) + s_w(k)

ERROR chooses the problem with the largest error. The error associated with problems
(LP(k+1)) and (LP(k+2)) is defined as

   (z* − z_w(k)) / (z* − z(k))
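The PROJECT quantities can be illustrated numerically. In this hedged Python sketch, the fractional solution, incumbent, and objective values are invented; it only traces the arithmetic of s(k), the empirical rate, and the projected objective.

```python
import math

def frac(v):
    """Fractional part f_i(k) = v - floor(v)."""
    return v - math.floor(v)

def sum_infeas(x, int_idx):
    """s(k): sum over i in S of min(f_i(k), 1 - f_i(k))."""
    return sum(min(frac(x[i]), 1 - frac(x[i])) for i in int_idx)

# Invented node data: x^opt(k) with integer-constrained set S = {0, 2}.
x = [1.25, 3.0, 2.5]
s_k = sum_infeas(x, [0, 2])          # 0.25 + 0.5
z_star, z0, s0 = 12.0, 10.0, 1.0     # incumbent z*, root objective z(0), s(0)
lam = (z_star - z0) / s0             # empirical rate of change
z_k = 10.5                           # objective of the active node (LP(k))
projected = z_k + lam * s_k          # PROJECT ranks active nodes by this
```

A node that is nearly integer feasible (small s(k)) projects close to its own objective, while a highly fractional node is projected further toward the incumbent.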
The BACKTRACK= option controls the search for the next problem. This option can take the same
values as the CANSELECT= option. In addition to the case outlined under the DELTAIT= option,
backtracking is required as follows based on the CANSELECT= option in effect:
• If CANSELECT=LIFO and there is no active node in the portion of the active tree currently
under exploration with a bound better than the value of WOBJECTIVE=, then the procedure
must backtrack.
• If CANSELECT=FIFO, PROJECT, PSEUDOC, or ERROR and the bound corresponding
to the node under consideration is not better than the value of WOBJECTIVE=, then the
procedure must backtrack.
The default value is OBJ.
The VARSELECT= option directs the choice of the branching variable. Valid keywords for this option
include CLOSE, FAR, PRIOR, PSEUDOC, PRICE, and PENALTY. The following list describes the
action that each of these causes when x^opt(k), an optimal solution of problem (LP(k)), is used to
define active problems (LP(k+1)) and (LP(k+2)).
CLOSE chooses as branching variable the variable x_i such that i minimizes

   {min{f_i(k), 1 − f_i(k)} | i ∈ S and IEPSILON ≤ f_i(k) ≤ 1 − IEPSILON}

FAR chooses as branching variable the variable x_i such that i maximizes

   {min{f_i(k), 1 − f_i(k)} | i ∈ S and IEPSILON ≤ f_i(k) ≤ 1 − IEPSILON}
PRIOR chooses as branching variable x_i such that i ∈ S, x^opt(k)_i is nonintegral, and
variable x_i has the minimum value in the INTEGER row in the input data set. This
choice for the VARSELECT= option is recommended when you have enough
insight into the model to identify those integer variables that have the most
significant effect on the objective value.
PENALTY chooses as branching variable x_i such that i ∈ S and a bound on the increase in
the objective of (LP(k)) (the penalty) resulting from adding the constraint

   x_i ≤ ⌊x^opt(k)_i⌋  or  x_i ≥ ⌈x^opt(k)_i⌉

is maximized.
PSEUDOC chooses as branching variable x_i such that i maximizes

   {max{d_i f_i(k), u_i (1 − f_i(k))} | i ∈ S and
   IEPSILON ≤ f_i(k) ≤ 1 − IEPSILON}

The weights u_i and d_i are initially equal to the absolute value of the ith objective
coefficient and are updated whenever an integer feasible solution is encountered.
See the discussion of the CANSELECT= option for details on the method of
updating the weights.
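The CLOSE and FAR rules can be sketched as a small selection function; the solution vector, set S, and tolerance handling below are illustrative assumptions, not PROC LP internals.

```python
import math

IEPSILON = 1e-8   # integer tolerance, mirroring the IEPSILON= option

def candidates(x, int_idx):
    """Integer infeasibility min(f_i, 1 - f_i) for each eligible i in S."""
    cand = {}
    for i in int_idx:
        f = x[i] - math.floor(x[i])
        if IEPSILON <= f <= 1 - IEPSILON:   # skip near-integral variables
            cand[i] = min(f, 1 - f)
    return cand

def choose(x, int_idx, rule):
    """CLOSE picks the smallest infeasibility, FAR the largest."""
    cand = candidates(x, int_idx)
    pick = min if rule == "CLOSE" else max
    return pick(cand, key=cand.get)

# Invented fractional solution: x2 is integral, x1 is maximally fractional.
x = [0.5, 2.0, 1.75, 3.1]
S = [0, 1, 2, 3]
```

With this solution, CLOSE branches on x[3] (fraction near 0.1) and FAR branches on x[0] (fraction 0.5); the integral variable x[1] is never a candidate.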
Customizing Search Heuristics
Often a good heuristic for searching the branch-and-bound tree of a problem can be found. You may
be tempted to continue using this heuristic when the problem data change but the problem structure
remains constant. The ability to reset procedure options interactively enables you to experiment
with search techniques in an attempt to identify approaches that perform well. Then you can easily
reapply these techniques to subsequent problems.
For example, the PIP branch-and-bound strategy (Crowder, Johnson, and Padberg 1983) describes
one such heuristic. The following program uses a similar strategy. Here, the OBJ rule (choose the
active node with least parent objective function in the case of a minimization problem) is used for
selecting the next active node to be solved until an integer feasible solution is found. Once such
a solution is found, the search procedure is changed to the LIFO rule: choose the problem most
recently placed in the list of active nodes.
proc lp canselect=obj ifeasiblepause=1;
run;
reset canselect=lifo ifeasiblepause=9999999;
run;
Further Discussion on the AUTO and CONTROL= Options
Consider a minimization problem. At each integer iteration, PROC LP selects a node to solve
from a pool of active nodes. The best bound strategy (CANSELECT=OBJ) picks the node with
the smallest projected objective value. This strategy improves the lower bound of the integer program
and usually takes fewer integer iterations. One disadvantage is that PROC LP must recalculate the
inverse of the basis matrix at almost every integer iteration; such recalculation is relatively expensive.
Another disadvantage is that this strategy does not pay attention to improving the upper bound of the
integer program. Thus the number of active nodes tends to grow rapidly if PROC LP cannot quickly
find an optimal integer solution.
On the other hand, the LIFO strategy is very efficient and does not need to calculate the inverse of
the basis matrix unless the previous node is fathomed. It is a depth-first strategy, so it tends to find an
integer feasible solution quickly. However, this strategy picks nodes locally and usually takes more
integer iterations than the best bound strategy.
There is another strategy that is often overlooked. Here it is called the best upper bound strategy.
With this strategy, each time you select an active node, instead of picking the node with the smallest
projected objective value, you select the one with the largest projected objective value. This strategy
is as efficient as the LIFO strategy. Moreover, it selects active nodes globally. This strategy tries
to improve the upper bound of the integer program by searching for new integer feasible solutions.
It also fathoms active nodes quickly and keeps the total number of active nodes below the current
level. A disadvantage is that this strategy may evaluate more nodes that have no potential for leading
to an optimal integer solution.
The best bound strategy has the advantage of improving the lower bound. The LIFO strategy has the
advantages of efficiency and finding a local integer feasible solution. The best upper bound strategy
has the advantages of keeping the size of the active node list under control and at the same time
trying to identify any potential integer feasible solution globally.
Although the best bound strategy is generally preferred, in some instances other strategies may be
more effective. For example, if you have found an integer optimal solution but do not know it, you
still have to enumerate all possible active nodes. The three strategies will then take basically the
same number of integer iterations after an optimal solution is found but not yet identified. Since the
LIFO and best upper bound strategies are very efficient per integer iteration, both will outperform
the best bound strategy.
Since no one strategy suits all situations, a hybrid strategy has been developed to increase applicability.
The CONTROL= option combines the above three strategies naturally and provides a simple control
parameter in [0, 1] for dealing with different integer programming problems and different solution
situations. The AUTO option automatically sets and adjusts the CONTROL= parameter so that you
do not need to know any problem structure or decide on a node selection strategy in advance.
Since the LIFO strategy is less costly, you should use it as much as possible in the combinations. The
following process is called a diving process. Starting from an active node, apply the LIFO strategy
as much as you can until the current node becomes feasible, is fathomed, or exceeds a preset limit.
During this process, there is no inverse matrix calculation involved except for the first node. When
the diving process is over, apply one of the three strategies to select the next starting node. One set
of combinations is called a cycle.
The control parameter r controls the frequency with which the three strategies are applied and the
depth of the diving process in a cycle. It starts with a pure best bound strategy at r = 0, and then
gradually increases the frequency of the diving processes and their depths as r increases. At r = 0.5,
one cycle contains a best bound strategy plus a full diving process. After r = 0.5, the number of
diving processes in a cycle gradually increases. In addition, the best upper bound strategy joins the
cycle. As r continues to increase, the frequency of the best upper bound strategy increases. At
r = 1, it becomes a pure best upper bound strategy.
The AUTO option automatically adjusts the value of the CONTROL= option. At the start, it sets
CONTROL=0.7, which emphasizes finding an upper bound. After an integer feasible solution is
found, it sets CONTROL=0.5, which emphasizes efficiency and lower bound improvement. When
the number of active nodes grows over the default or user-defined limit m, that growth indicates that
a better upper bound is needed. The AUTO option then starts to increase the value of CONTROL=
from 0.5. If the size of the active node list continues to grow, so will the value of the CONTROL=
option. When the size of the active node list reaches the default or user-defined limit n, CONTROL=
is set to 1. At this moment, the growth of active nodes is stopped. When the size of the active node
list decreases, AUTO decreases the value of the CONTROL= option.
You can use other strategies to improve the lower bound by setting CANSELECT= to other options.
Saving and Restoring the List of Active Nodes
The list of active nodes can be saved in a SAS data set for use at a subsequent invocation of PROC
LP. The ACTIVEOUT= option in the PROC LP statement names the data set into which the current
list of active nodes is saved when the procedure terminates due to an error termination condition.
Examples of such conditions are time limit exceeded, integer iterations exceeded, and phase 3
iterations exceeded. The ACTIVEIN= option in the PROC LP statement names a data set that can
be used to initialize the list of active nodes. To achieve the greatest benet when restarting PROC
LP, use the PRIMALOUT= and PRIMALIN= options in conjunction with the ACTIVEOUT= and
ACTIVEIN= options. See Example 5.10 in the section Examples: LP Procedure on page 241 for
an illustration.
Sensitivity Analysis
Sensitivity analysis is a technique for examining the effects of changes in model parameters on the
optimal solution. The analysis enables you to examine the size of a perturbation to the right-hand-side
or objective vector by an arbitrary change vector for which the basis of the current optimal solution
remains optimal.
NOTE: When sensitivity analysis is performed on integer-constrained problems, the integer variables
are fixed at the value they obtained in the integer optimal solution. Therefore, care must be used when
interpreting the results of such analyses. Care must also be taken when preprocessing is enabled,
because preprocessing usually alters the original formulation.
Right-Hand-Side Sensitivity Analysis
Consider the problem (lpr(φ)):

   x^opt(φ) = min c^T x
   subject to Ax {≥, =, ≤} b + φr
              ℓ ≤ x ≤ u

where r is a right-hand-side change vector.
Let x^opt(φ) denote an optimal basic feasible solution to (lpr(φ)). PROC LP can be used to examine
the effects of changes in φ on the solution x^opt(0) of problem (lpr(0)). For the basic solution
x^opt(0), let B be the matrix composed of the basic columns of A and let N be the matrix composed
of the nonbasic columns of A. For the basis matrix B, the basic components of x^opt(0), written as
x^opt(0)_B, can be expressed as

   x^opt(0)_B = B⁻¹(b − N x^opt(0)_N)

Furthermore, because x^opt(0) is feasible,

   ℓ_B ≤ B⁻¹(b − N x^opt(0)_N) ≤ u_B

where ℓ_B is a column vector of the lower bounds on the structural basic variables, and u_B is a
column vector of the upper bounds on the structural basic variables. For each right-hand-side change
vector r identified in the RHSSEN statement, PROC LP finds an interval [φ_min, φ_max] such that

   ℓ_B ≤ B⁻¹(b + φr − N x^opt(0)_N) ≤ u_B

for φ ∈ [φ_min, φ_max]. Furthermore, because changes in the right-hand side do not affect the reduced
costs, for φ ∈ [φ_min, φ_max],

   x^opt(φ)^T = ((B⁻¹(b + φr − N x^opt(0)_N))^T, x^opt(0)_N^T)

is optimal in (lpr(φ)).
For φ = φ_min and φ = φ_max, PROC LP reports the following:
• the names of the leaving variables
• the value of the optimal objective in the modified problems
• the RHS values in the modified problems
• the solution status, reduced costs, and activities in the modified problems
The leaving variable identifies the basic variable x_i that first reaches either the lower bound ℓ_i or the
upper bound u_i as φ reaches φ_min or φ_max. This is the basic variable that would leave the basis to
maintain primal feasibility. Multiple RHSSEN variables can appear in a problem data set.
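The interval computation can be illustrated for a tiny fixed basis. In this hedged Python sketch, the 2x2 basis, bounds, and change vector are invented, and the nonbasic variables are assumed to sit at zero so the N x^opt(0)_N term vanishes.

```python
def rhs_range(B, b, r, lb, ub):
    """Interval of phi keeping l_B <= B^{-1}(b + phi*r) <= u_B (2x2 sketch)."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    Binv = [[B[1][1] / det, -B[0][1] / det],
            [-B[1][0] / det, B[0][0] / det]]
    xB = [sum(Binv[i][j] * b[j] for j in range(2)) for i in range(2)]  # B^-1 b
    dx = [sum(Binv[i][j] * r[j] for j in range(2)) for i in range(2)]  # B^-1 r
    lo, hi = float("-inf"), float("inf")
    for i in range(2):
        # each basic value xB[i] + phi*dx[i] must stay inside [lb[i], ub[i]]
        if dx[i] > 0:
            lo = max(lo, (lb[i] - xB[i]) / dx[i])
            hi = min(hi, (ub[i] - xB[i]) / dx[i])
        elif dx[i] < 0:
            lo = max(lo, (ub[i] - xB[i]) / dx[i])
            hi = min(hi, (lb[i] - xB[i]) / dx[i])
    return lo, hi
```

The basic variable whose bound produces the binding ratio at φ_min or φ_max is the leaving variable in the discussion above.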
Price Sensitivity Analysis
Consider the problem (lpp(φ)):

   x^opt(φ) = min (c + φr)^T x
   subject to Ax {≥, =, ≤} b
              ℓ ≤ x ≤ u

where r is a price change vector.
Let x^opt(φ) denote an optimal basic feasible solution to (lpp(φ)). PROC LP can be used to examine
the effects of changes in φ on the solution x^opt(0) of problem (lpp(0)). For the basic solution
x^opt(0), let B be the matrix composed of the basic columns of A and let N be the matrix composed
of the nonbasic columns of A. For basis matrix B, the reduced cost associated with the ith variable
can be written as

   rc_i(φ) = ((c + φr)_N^T − (c + φr)_B^T B⁻¹ N)_i

where (c + φr)_N and (c + φr)_B is a partition of the vector of price coefficients into nonbasic and
basic components. Because x^opt(0) is optimal in (lpp(0)), the reduced costs satisfy

   rc_i(φ) ≥ 0

if the nonbasic variable in column i is at its lower bound, and

   rc_i(φ) ≤ 0

if the nonbasic variable in column i is at its upper bound.
For each price coefficient change vector r identified with the keyword PRICESEN in the TYPE
variable, PROC LP finds an interval [φ_min, φ_max] such that for φ ∈ [φ_min, φ_max],

   rc_i(φ) ≥ 0

if the nonbasic variable in column i is at its lower bound, and

   rc_i(φ) ≤ 0

if the nonbasic variable in column i is at its upper bound. Because changes in the price coefficients
do not affect feasibility, for φ ∈ [φ_min, φ_max], x^opt(φ) is optimal in (lpp(φ)). For φ = φ_min and
φ = φ_max, PROC LP reports the following:
• the names of entering variables
• the value of the optimal objective in the modified problems
• the price coefficients in the modified problems
• the solution status, reduced costs, and activities in the modified problems
The entering variable identifies the variable whose reduced cost first goes to zero as φ reaches
φ_min or φ_max. This is the nonbasic variable that would enter the basis to maintain optimality (dual
feasibility). Multiple PRICESEN variables may appear in a problem data set.
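The price interval can be sketched the same way: each nonbasic reduced cost is affine in φ, rc_i(φ) = rc_i(0) + φ d_i, and the basis stays optimal while every reduced cost keeps its required sign. The reduced costs and their derivatives below are invented for illustration.

```python
def price_range(rc0, drc, at_upper):
    """Interval of phi over which every nonbasic reduced cost keeps its sign.

    rc0[i]      reduced cost at phi = 0
    drc[i]      derivative of rc_i with respect to phi
    at_upper[i] True if the nonbasic variable sits at its upper bound
                (rc must stay <= 0 there; >= 0 at a lower bound)
    """
    lo, hi = float("-inf"), float("inf")
    for r0, d, upper in zip(rc0, drc, at_upper):
        if d == 0:
            continue                    # this reduced cost never changes sign
        crossing = -r0 / d              # phi at which rc_i(phi) = 0
        if (d > 0) != upper:            # sign requirement holds above crossing
            lo = max(lo, crossing)
        else:                           # sign requirement holds below crossing
            hi = min(hi, crossing)
    return lo, hi
```

The variable whose crossing point defines φ_min or φ_max is the entering variable: its reduced cost is the first to reach zero.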
Range Analysis
Range analysis is sensitivity analysis for specific change vectors. As in the sensitivity analysis
case, care must be used in interpreting the results of range analysis when the problem has integers or
the preprocessing option is enabled.
Right-Hand-Side Range Analysis
The effects on the optimal solution of changes in each right-hand-side value can be studied using the
RANGERHS option in the PROC LP or RESET statement. This option results in sensitivity analysis
for the m right-hand-side change vectors specified by the columns of the m × m identity matrix.
Price Range Analysis
The effects on the optimal solution of changes in each price coefficient can be studied using the
RANGEPRICE option in the PROC LP or RESET statement. This option results in sensitivity
analysis for the n price change vectors specified by the rows of the n × n identity matrix.
Parametric Programming
Sensitivity analysis and range analysis examine how the optimal solution behaves with respect to
perturbations of model parameter values. These approaches assume that the basis at optimality is not
allowed to change. When greater flexibility is desired and a change of basis is acceptable, parametric
programming can be used.
As in the sensitivity analysis case, care must be used in interpreting the results of parametric
programming when the problem has integers or the preprocessing option is enabled.
Right-Hand-Side Parametric Programming
As discussed in the section Right-Hand-Side Sensitivity Analysis on page 226, for each right-hand-
side change vector r, PROC LP finds an interval [φ_min, φ_max] such that for φ ∈ [φ_min, φ_max],

   x^opt(φ)^T = ((B⁻¹(b + φr − N x^opt(0)_N))^T, x^opt(0)_N^T)

is optimal in (lpr(φ)) for the fixed basis B. Leaving variables that inhibit further changes in φ
without a change in the basis B are associated with the quantities φ_min and φ_max. By specifying
RHSPHI=Φ in either the PROC LP statement or the RESET statement, you can examine the solution
x^opt(φ) as φ increases or decreases from 0 to Φ.
When RHSPHI=Φ is specified, the procedure first finds the interval [φ_min, φ_max] as described
previously. Then, if Φ ∈ [φ_min, φ_max], no further investigation is needed. However, if Φ > φ_max
or Φ < φ_min, then the procedure attempts to solve the new problem (lpr(Φ)). To accomplish this, it
pivots the leaving variable out of the basis while maintaining dual feasibility. If this new solution is
primal feasible in (lpr(Φ)), no further investigation is needed; otherwise, the procedure identifies
the new leaving variable and pivots it out of the basis, again maintaining dual feasibility. Dual
pivoting continues in this manner until a solution that is primal feasible in (lpr(Φ)) is identified.
Because dual feasibility is maintained at each pivot, the (lpr(Φ)) primal feasible solution is optimal.
At each pivot, the procedure reports on the variables that enter and leave the basis, the current range
of φ, and the objective value. When x^opt(Φ) is found, it is displayed. If you want the solution
x^opt(φ) at each pivot, then specify the PARAPRINT option in either the PROC LP or the RESET
statement.
Price Parametric Programming
As discussed in the section Price Sensitivity Analysis on page 227, for each price change vector r,
PROC LP finds an interval [φ_min, φ_max] such that for each φ ∈ [φ_min, φ_max],

   rc_i(φ) = ((c + φr)_N^T − (c + φr)_B^T B⁻¹ N)_i

satisfies the conditions for optimality in (lpp(φ)) for the fixed basis B. Entering variables that
inhibit further changes in φ without a change in the basis B are associated with the quantities φ_min
and φ_max. By specifying PRICEPHI=Φ in either the PROC LP statement or the RESET statement,
you can examine the solution x^opt(φ) as φ increases or decreases from 0 to Φ.
When PRICEPHI=Φ is specified, the procedure first finds the interval [φ_min, φ_max], as described
previously. Then, if Φ ∈ [φ_min, φ_max], no further investigation is needed. However, if Φ > φ_max
or Φ < φ_min, the procedure attempts to solve the new problem (lpp(Φ)). To accomplish this, it
pivots the entering variable into the basis while maintaining primal feasibility. If this new solution
is dual feasible in (lpp(Φ)), no further investigation is needed; otherwise, the procedure identifies
the new entering variable and pivots it into the basis, again maintaining primal feasibility. Pivoting
continues in this manner until a solution that is dual feasible in (lpp(Φ)) is identified. Because
primal feasibility is maintained at each pivot, the (lpp(Φ)) dual feasible solution is optimal.
At each pivot, the procedure reports on the variables that enter and leave the basis, the current range
of φ, and the objective value. When x^opt(Φ) is found, it is displayed. If you want the solution
x^opt(φ) at each pivot, then specify the PARAPRINT option in either the PROC LP or the RESET
statement.
Interactive Facilities
The interactive features of the LP procedure enable you to examine intermediate results; perform
sensitivity analysis, parametric programming, and range analysis; and control the solution process.
Controlling Interactive Features
You can gain control of the LP procedure for interactive processing by setting a breakpoint or pressing
the CTRL-BREAK key combination, or when certain error conditions are encountered:
• when a feasible solution is found
• at each pivot of the simplex algorithm
• when an integer feasible solution is found
• at each integer pivot of the branch-and-bound algorithm
• after the data are read but before iteration begins
• after at least one integer feasible solution has been found that is within desirable proximity
of optimality
• after the problem has been solved but before results are displayed
When the LP procedure pauses, you can enter any of the interactive statements RESET, PIVOT,
IPIVOT, PRINT, SHOW, QUIT, and RUN.
Breakpoints are set using the FEASIBLEPAUSE, PAUSE=, IFEASIBLEPAUSE=, IPAUSE=,
PROXIMITYPAUSE=, READPAUSE, and ENDPAUSE options. The LP procedure displays a message on
the SAS log when it gives you control because of encountering one of these breakpoints.
During phase 1, 2, or 3, the CTRL-BREAK key pauses the LP procedure and releases control at
the beginning of the next iteration.
The error conditions that usually cause the LP procedure to pause include time limit exceeded,
phase 1 iterations exceeded, phase 2 iterations exceeded, phase 3 iterations exceeded, and integer
iterations exceeded. You can use the RESET statement to reset the option that caused the error
condition.
The PIVOT and IPIVOT statements return control to you after a single simplex algorithm
pivot or a single integer pivot, respectively. The PRINT and SHOW statements display current solution
information and return control to you. The QUIT statement, on the other hand, requests an immediate
exit from the LP procedure. If you want to quit but save the output data sets, then type QUIT/SAVE.
The RUN statement requests the LP procedure to continue its execution immediately.
Displaying Intermediate Results
Once you have control of the procedure, you can examine the current values of the options and the
status of the problem being solved using the SHOW statement. All displaying done by the SHOW
statement goes to the SAS log.
Details about the current status of the solution are obtained using the PRINT statement. The various
display options enable you to examine parts of the variable and constraint summaries, display the
current tableau, perform sensitivity analysis on the current solution, and perform range analysis.
Interactive Facilities in Batch Mode
All of the interactive statements can be used when processing in batch mode. This is particularly
convenient when the interactive facilities are used to combine different search strategies in solving
integer problems.
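As an illustrative sketch (the data set name PROB and the option settings are hypothetical), the following batch-mode program pauses when the first integer feasible solution is found, displays it, switches the branch-and-bound search strategy with RESET, and then resumes:

```sas
/* Pause at the first integer feasible solution, inspect it,   */
/* change the search strategy, and continue. PROB is a         */
/* hypothetical problem data set.                              */
proc lp data=prob ifeasiblepause=1;
run;
   print solution;                  /* display the current best solution */
   reset canselect=lifo backtrack=obj ifeasiblepause=9999999;
   run;                             /* resume with the new strategy      */
quit;
```

Because the statements are processed in order, this technique lets you combine different search strategies within a single batch submission.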
Sensitivity Analysis
Two features that enhance the ability to perform sensitivity analysis need further explanation. When
you specify /SENSITIVITY in a PRINT COLUMN(colnames) statement, the LP procedure defines a
new change row to use in sensitivity analysis and parametric programming. This new change row has
a +1 entry for each variable listed in the PRINT statement. This enables you to define new change
rows interactively.
When you specify /SENSITIVITY in a PRINT ROW(rownames) statement, the LP procedure
defines a new change column to use in sensitivity analysis and parametric programming. This new
change column has a +1 entry for each right-hand-side coefficient listed in the PRINT statement.
This enables you to define new change columns interactively.
In addition, you can interactively change the RHSPHI= and PRICEPHI= options using the RESET
statement. This enables you to perform parametric programming interactively.
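A minimal sketch of this interactive workflow (the data set name PROB and the row name BUDGET are hypothetical):

```sas
/* Define a change column for the BUDGET row interactively,  */
/* then vary the right-hand side by resetting RHSPHI=.       */
proc lp data=prob;
run;
   print row(budget) / sensitivity;  /* new change column for BUDGET     */
   reset rhsphi=10;                  /* parametric programming on the RHS */
   run;
quit;
```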
Memory Management
There are no restrictions on the problem size in the LP procedure. The number of constraints and
variables in a problem that PROC LP can solve depends on the host platform, the available memory,
and the available disk space for utility data sets.
Memory usage is affected by a great many factors, including the density of the technological
coefficient matrix, the model structure, and the density of the decomposed basis matrix. The
algorithm requires that the decomposed basis fit completely in memory. Any additional memory is
used for nonbasic columns. The partition between the decomposed basis and the nonbasic columns
is dynamic so that as the inverse grows, which typically happens as iterations proceed, more memory
is available to it and less is available for the nonbasic columns.
The LP procedure determines the initial size of the decomposed basis matrix. If the area used is too
small, PROC LP must spend time compressing this matrix, which degrades performance. If PROC
LP must compress the decomposed basis matrix on the average more than 15 times per iteration,
then the size of the memory devoted to the basis is increased. If the work area cannot be made large
enough to invert the basis, an error return occurs. On the other hand, if PROC LP compresses the
decomposed basis matrix on the average once every other iteration, then memory devoted to the
decomposed basis is decreased, freeing memory for the nonbasic columns.
For many models, memory constraints are not a problem because both the decomposed basis and
all the nonbasic columns fit in memory without difficulty. However, when a model becomes large
relative to the available memory, the algorithm tries to adjust memory distribution in order to solve
the problem. In the worst cases, only one nonbasic column ts in memory with the decomposed
basis matrix.
Problems involving memory use can occur when solving mixed-integer problems. Data associated
with each node in the branch-and-bound tree must be kept in memory. As the tree grows, competition
for memory by the decomposed basis, the nonbasic columns, and the branch-and-bound tree may
become critical. If the situation becomes critical, the procedure automatically switches to branching
strategies that use less memory. However, it is possible to reach a point where no further processing
is possible. In this case, PROC LP terminates on a memory error.
Output Data Sets
The LP procedure can optionally produce five output data sets: the ACTIVEOUT=,
PRIMALOUT=, DUALOUT=, TABLEAUOUT=, and MPSOUT= data sets. Each contains two
variables that identify the particular problem in the input data set:
_OBJ_ID_ identifies the objective function ID.
_RHS_ID_ identifies the right-hand-side variable.
Additionally, each data set contains other variables, which are discussed below.
ACTIVEOUT= Data Set
The ACTIVEOUT= data set contains a representation of the current active branch-and-bound
tree. You can use this data set to initialize the branch-and-bound tree to continue iterations on an
incompletely solved problem. Each active node in the tree generates two observations in this data
set. The first is a LOWERBD observation that is used to reconstruct the lower-bound constraints
on the currently described active node. The second is an UPPERBD observation that is used to
reconstruct the upper-bound constraints on the currently described active node. In addition to these,
an observation that describes the current best integer solution is included. The data set contains the
following variables:
_STATUS_ contains the keywords LOWERBD, UPPERBD, and INTBEST for identifying
the type of observation.
_PROB_ contains the problem number for the current observation.
_OBJECT_ contains the objective value of the parent problem that generated the current
observation's problem.
_SINFEA_ contains the sum of the integer infeasibilities of the current observation's problem.
_PROJEC_ contains the data needed for CANSELECT=PROJECT when the branch-and-
bound tree is read using the ACTIVEIN= option.
_PSEUDO_ contains the data needed for CANSELECT=PSEUDOC when the branch-and-
bound tree is read using the ACTIVEIN= option.
INTEGER VARIABLES Integer-constrained structural variables are also included in the ACTIVE-
OUT= data set. For each observation, these variables contain values for dening
the active node in the branch-and-bound tree.
PRIMALOUT= Data Set
The PRIMALOUT= data set contains the current primal solution. If the problem has integer-
constrained variables, the PRIMALOUT= data set contains the current best integer feasible solution.
If none have been found, the PRIMALOUT= data set contains the relaxed solution. In addition to
_OBJ_ID_ and _RHS_ID_, the data set contains the following variables:
_VAR_ identifies the variable name.
_TYPE_ identifies the type of the variable as specified in the input data set. Artificial
variables are labeled as type ARTIFCL.
_STATUS_ identifies whether the variable is basic, nonbasic, or at an upper bound in the
current solution.
_LBOUND_ contains the input lower bound on the variable unless the variable is integer-
constrained and an integer solution is given. In this case, _LBOUND_ contains the
lower bound on the variable needed to realize the integer solution on subsequent
calls to PROC LP when using the PRIMALIN= option.
_VALUE_ identifies the value of the variable in the current solution or the current best integer
feasible solution.
_UBOUND_ contains the input upper bound on the variable unless the variable is integer-
constrained and an integer solution is given. In this case, _UBOUND_ contains the
upper bound on the variable needed to realize the integer solution on subsequent
calls to PROC LP when using the PRIMALIN= option.
_PRICE_ contains the input price coefficient of the variable.
_R_COST_ identifies the value of the reduced cost in the current solution. Example 5.3 in the
section Examples: LP Procedure on page 241 shows a typical PRIMALOUT=
data set. Note that it is necessary to include the information on the objective function
and right-hand side in order to distinguish problems in multiple-problem data
sets.
DUALOUT= Data Set
The DUALOUT= data set contains the dual solution for the current solution. If the problem has
integer-constrained variables, the DUALOUT= data set contains the dual for the current best integer
solution, if any. Otherwise it contains the dual for the relaxed solution. In addition to _OBJ_ID_ and
_RHS_ID_, it contains the following variables:
_ROW_ID_ identifies the row or constraint name.
_TYPE_ identifies the type of the row as specified in the input data set.
_RHS_ gives the value of the right-hand side on input.
_L_RHS_ gives the lower bound for the row evaluated from the input right-hand-side value,
the TYPE of the row, and the value of the RANGE variable for the row.
_VALUE_ gives the value of the row, at optimality, excluding logical variables.
_U_RHS_ gives the upper bound for the row evaluated from the input right-hand-side value,
the TYPE of the row, and the value of the RANGE variable for the row.
_DUAL_ gives the value of the dual variable associated with the row.
TABLEAUOUT= Data Set
The TABLEAUOUT= data set contains the current tableau. Each observation, except for the first,
corresponds to a basic variable in the solution. The observation labeled R_COSTS contains the
reduced costs c_N^T - c_B^T B^{-1} N. In addition to _OBJ_ID_ and _RHS_ID_, it contains the following
variables:
_BASIC_ gives the names of the basic variables in the solution.
INVB_R gives the values of B^{-1} r, where r is the right-hand-side vector.
STRUCTURAL VARIABLES give the values in the tableau, namely B^{-1} A.
MPSOUT= Data Set
The MPSOUT= data set contains problem data converted from PROC LP format into an MPS-
format SAS data set. The six fields, FIELD1 to FIELD6, in the MPSOUT= data set correspond to
the six columns in the MPS standard. For more information about the MPS-format SAS data set, see
Chapter 16, The MPS-Format SAS Data Set.
Input Data Sets
In addition to the DATA= input data set, PROC LP recognizes the ACTIVEIN= and the PRIMALIN=
data sets.
ACTIVEIN= Data Set
The ACTIVEIN= data set contains a representation of the current active tree. The format is identical
to that of the ACTIVEOUT= data set.
PRIMALIN= Data Set
The format of the PRIMALIN= data set is identical to the PRIMALOUT= data set. PROC LP uses
the PRIMALIN= data set to identify variables at their upper bounds in the current solution and
variables that are basic in the current solution.
You can add observations to the end of the problem data set if they define cost (right-hand-side)
sensitivity change vectors and have PRICESEN (RHSSEN) types. This enables you to solve a
problem, save the solution in a SAS data set, and perform sensitivity analysis later. You can also use
the PRIMALIN= data set to restart problems that have not been completely solved or to which new
variables have been added.
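A hedged sketch of such a restart (the data set names PROB, ACT, and PRIM, and the iteration limits, are hypothetical), assuming the first run stops at the integer iteration limit:

```sas
/* First run: save the branch-and-bound tree and the current   */
/* best solution when the integer iteration limit is reached.  */
proc lp data=prob imaxit=50
        activeout=act primalout=prim;
run;

/* Second run: resume the search from the saved tree and solution. */
proc lp data=prob imaxit=1000
        activein=act primalin=prim;
run;
```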
Displayed Output
The output from the LP procedure is discussed in the following six sections:
v Problem Summary
v Solution Summary including a Variable Summary and a Constraint Summary
v Infeasible Information Summary
v RHS Sensitivity Analysis Summary (the RHS Range Analysis Summary is not discussed)
v Price Sensitivity Analysis Summary (the Price Range Analysis Summary is not discussed)
v Iteration Log
For integer-constrained problems, the procedure also displays an Integer Iteration Log. The
description of this log can be found in the section Integer Programming on page 217. When you
request that the tableau be displayed, the procedure displays the Current Tableau. The description of
this can be found in the section The Reduced Costs, Dual Activities, and Current Tableau on page 213.
A problem data set can contain a set of constraints with several right-hand sides and several objective
functions. PROC LP considers each combination of right-hand side and objective function as defining
a new linear programming problem and solves each, performing all specified sensitivity analysis
on each problem. For each problem dened, PROC LP displays a new sequence of output sections.
Example 5.1 in the section Examples: LP Procedure on page 241 discusses each of these elements.
The LP procedure produces the following displayed output by default.
The Problem Summary
The problem summary includes the
v type of optimization and the name of the objective row (as identied by the ID or ROW
variable)
v name of the SAS variable that contains the right-hand-side constants
v name of the SAS variable that contains the type keywords
v density of the coefficient matrix (the ratio of the number of nonzero elements to the number of
total elements) after the slack and surplus variables have been appended
v number of each type of variable in the mathematical program
v number of each type of constraint in the mathematical program
The Solution Summary
The solution summary includes the
v termination status of the procedure
v objective value of the current solution
v number of phase 1 iterations that were completed
v number of phase 2 iterations that were completed
v number of phase 3 iterations that were completed
v number of integer iterations that were completed
v number of integer feasible solutions that were found
v number of initial basic feasible variables identied
v time used in solving the problem excluding reading the data and displaying the solution
v number of inversions of the basis matrix
v current value of several of the options
The Variable Summary
The variable summary includes the
v column number associated with each structural or logical variable in the problem
v name of each structural or logical variable in the problem. (PROC LP gives the logical
variables the name of the constraint ID. If no ID variable is specified, the procedure names the
logical variable _OBSn_, where n is the observation that describes the constraint.)
v variable's status in the current solution. The status can be BASIC, DEGEN, ALTER, blank,
LOWBD, or UPPBD, depending upon whether the variable is a basic variable, a degenerate
variable (that is, a basic variable whose activity is at its input lower bound), a nonbasic variable
that can be brought into the basis to dene an alternate optimal solution, a nonbasic variable at
its default lower bound 0, a nonbasic variable at its lower bound, or a nonbasic variable at its
upper bound.
v type of variable (whether it is logical or structural, and, if structural, its bound type, or other
value restriction). See Example 5.1 for a list of possible types in the variable summary.
v value of the objective coefficient associated with each variable
v activity of the variable in the current solution
v variable's reduced cost in the current solution
The Constraint Summary
The constraint summary includes the
v constraint row number and its ID
v kind of constraint (whether it is an OBJECTIVE, LE, EQ, GE, RANGELE, RANGEEQ,
RANGEGE, or FREE row)
v number of the slack or surplus variable associated with the constraint row
v value of the right-hand-side constant associated with the constraint row
v current activity of the row (excluding logical variables)
v current activity of the dual variable (shadow price) associated with the constraint row
The Infeasible Information Summary
The infeasible information summary includes the
v name of the infeasible row or variable
v current activity for the row or variable
v type of the row or variable
v value of right-hand-side constant
v name of each nonzero and nonmissing variable in the row
v activity and upper and lower bounds for the variable
The RHS Sensitivity Analysis Summary
The RHS sensitivity analysis summary includes the
v value of Φ_min
v leaving variable when Φ = Φ_min
v objective value when Φ = Φ_min
v value of Φ_max
v leaving variable when Φ = Φ_max
v objective value when Φ = Φ_max
v column number and name of each logical and structural variable
v variable's status when Φ_min ≤ Φ ≤ Φ_max
v variable's reduced cost when Φ_min ≤ Φ ≤ Φ_max
v value of the right-hand-side constant when Φ = Φ_min
v activity of the variable when Φ = Φ_min
v value of the right-hand-side constant when Φ = Φ_max
v activity of the variable when Φ = Φ_max
The Price Sensitivity Analysis Summary
The price sensitivity analysis summary includes the
v value of Φ_min
v entering variable when Φ = Φ_min
v objective value when Φ = Φ_min
v value of Φ_max
v entering variable when Φ = Φ_max
v objective value when Φ = Φ_max
v column number and name of each logical and structural variable
v variable's status when Φ_min ≤ Φ ≤ Φ_max
v activity of the variable when Φ_min ≤ Φ ≤ Φ_max
v price of the variable when Φ = Φ_min
v variable's reduced cost when Φ = Φ_min
v price of the variable when Φ = Φ_max
v variable's reduced cost when Φ = Φ_max
The Iteration Log
The iteration log includes the
v phase number
v iteration number in each phase
v name of the leaving variable
v name of the entering variable
v variables reduced cost
v objective value
ODS Table and Variable Names
PROC LP assigns a name to each table it creates. You can use these names to select output tables
when using the Output Delivery System (ODS).
Table 5.9 ODS Tables Produced in PROC LP

Table Name                Description                              Statement/Option
ProblemSummary            Problem summary                          Default
SolutionSummary           Solution summary                         Default
VariableSummary           Variable summary                         Default
ConstraintSummary         Constraint summary                       Default
IterationLog              Iteration log                            FLOW
IntegerIterationLog       Integer iteration log                    Default
PriceSensitivitySummary   Price sensitivity analysis summary       Default, PRINT PRICESEN, or PRINT COLUMN/SENSITIVITY
PriceActivities           Price activities at Φ_min and Φ_max      Default, PRINT PRICESEN, or PRINT COLUMN/SENSITIVITY
PriceActivity             Price activity at Φ_min or Φ_max         PRICEPHI= and PARAPRINT
PriceParametricLog        Price parametric programming log         PRICEPHI=
PriceRangeSummary         Price range analysis                     RANGEPRICE or PRINT RANGEPRICE
RhsSensitivitySummary     RHS sensitivity analysis summary         Default, PRINT RHSSEN, or PRINT ROW/SENSITIVITY
RhsActivities             RHS activities at Φ_min and Φ_max        Default, PRINT RHSSEN, or PRINT ROW/SENSITIVITY
RhsActivity               RHS activity at Φ_min or Φ_max           RHSPHI= and PARAPRINT
RhsParametricLog          RHS parametric programming log           RHSPHI=
RhsRangeSummary           RHS range analysis                       RANGERHS or PRINT RANGERHS
InfeasibilitySummary      Infeasible row or variable summary       Default
InfeasibilityActivity     Variable activity in an infeasible row   Default
CurrentTableau            Current tableau                          TABLEAUPRINT or PRINT TABLEAU
Matrix                    Technological matrix                     PRINT MATRIX
MatrixPicture             Technological matrix picture             PRINT MATRIX/PICTURE
MatrixPictureLegend       Technological matrix picture legend      PRINT MATRIX/PICTURE
The following table lists the variable names of the preceding tables used in the ODS template of the
LP procedure.
Table 5.10 Variable Names for the ODS Tables Produced in PROC LP

Table Name             Variables
VariableSummary        VarName, Status, Type, Price, Activity, ReducedCost
ConstraintSummary      Row, RowName, Type, SSCol, Rhs, Activity, Dual
IterationLog           Phase, Iteration, EnterVar, EnterCol, LeaveVar, LeaveCol, ReducedCost, ObjValue
IntegerIterationLog    Iteration, Problem, Condition, Objective, Branch, Value, SumOfInfeas, Active, Proximity
PriceActivities        Col, VarName, Status, Activity, MinPrice, MinReducedCost, MaxPrice, MaxReducedCost
PriceActivity          Col, VarName, Status, Activity, Price, ReducedCost
PriceParametricLog     LeaveVar, LeaveCol, EnterVar, EnterCol, ObjValue, CurrentPhi
PriceRangeSummary      Col, VarName, MinPrice, MinEnterVar, MinObj, MaxPrice, MaxEnterVar, MaxObj
RhsActivities          Col, VarName, Status, ReducedCost, MinRhs, MinActivity, MaxRhs, MaxActivity
RhsActivity            Col, VarName, Status, ReducedCost, Rhs, Activity
RhsParametricLog       LeaveVar, LeaveCol, EnterVar, EnterCol, ObjValue, CurrentPhi
RhsRangeSummary        RowName, MinRhs, MinLeaveVar, MinObj, MaxRhs, MaxLeaveVar, MaxObj
InfeasibilityActivity  VarName, Coefficient, Activity, Lower, Upper
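A brief sketch of using these ODS names (the data set names PROB and CONSUM are hypothetical): display only the solution summary and capture the constraint summary in an output data set.

```sas
/* Restrict displayed output to the solution summary and       */
/* route the constraint summary to a data set for later use.   */
ods select SolutionSummary;
ods output ConstraintSummary=consum;
proc lp data=prob;
run;
quit;
ods select all;    /* restore the default ODS selection */
```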
Memory Limit
The system option MEMSIZE sets a limit on the amount of memory used by the SAS System. If
you do not specify a value for this option, then the SAS System sets a default memory limit. Your
operating environment determines the actual size of the default memory limit, which is sufficient for
many applications. However, to solve most realistic optimization problems, the LP procedure might
require more memory. Increasing the memory limit can reduce the chance of an out-of-memory
condition.
NOTE: The MEMSIZE system option is not available in some operating environments. See the
documentation for your operating environment for more information.
You can specify -MEMSIZE 0 to indicate all available memory should be used, but this setting
should be used with caution. In most operating environments, it is better to specify an adequate
amount of memory than to specify -MEMSIZE 0. For example, if you are running PROC OPTLP
to solve LP problems with only a few hundred thousand variables and constraints, -MEMSIZE
500M might be sufficient to allow the procedure to run without an out-of-memory condition. When
problems have millions of variables, -MEMSIZE 1000M or higher might be needed. These are
rules of thumb; problems with atypical structure, density, or other characteristics can increase the
optimizer's memory requirements.
The MEMSIZE option can be specified at system invocation, on the SAS command line, or in a
configuration file. The syntax is described in the SAS Companion book for your operating system.
To report a procedure's memory consumption, you can use the FULLSTIMER option. The syntax is
described in the SAS Companion book for your operating system.
Examples: LP Procedure
The following examples illustrate some of the capabilities of PROC LP. These examples,
together with the other SAS/OR examples, can be found in the SAS sample library. The features of
PROC LP shown in the examples are as follows:
Example 5.1 dense input format
Example 5.2 sparse input format
Example 5.3 the RANGEPRICE option to show you the range over which each objective
coefficient can vary without changing the variables in the basis
Example 5.4 more sensitivity analysis and restarting a problem
Example 5.5 parametric programming
Example 5.6 special ordered sets
Example 5.7 goal programming
Example 5.8 integer programming
Example 5.9 an infeasible problem
Example 5.10 restarting integer programs
Example 5.11 controlling the search of the branch-and-bound tree
Example 5.12 matrix generation and report writing for an assignment problem
Example 5.13 matrix generation and report writing for a scheduling problem
Example 5.14 a multicommodity transshipment problem
Example 5.1: An Oil Blending Problem
The blending problem presented in the introduction is a good example for demonstrating some of
the features of the LP procedure. Recall that a step in refining crude oil into finished oil products
involves a distillation process that splits crude into various streams. Suppose that there are three
types of crude available: Arabian light, Arabian heavy, and Brega. These are distilled into light
naphtha, intermediate naphtha, and heating oil. Using one of two recipes, these in turn are blended
into jet fuel.
Assume that you can sell as much fuel as is produced. What production strategy maximizes the profit
from jet fuel sales? The following SAS code demonstrates a way of answering this question using
linear programming. The SAS data set is a representation of the formulation for this model given in
the introductory section.
data;
input _row_ $17.
a_light a_heavy brega naphthal naphthai heatingo jet_1
jet_2 _type_ $ _rhs_;
datalines;
profit -175 -165 -205 0 0 0 300 300 max .
naphtha_l_conv .035 .030 .045 -1 0 0 0 0 eq 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0 eq 0
heating_o_conv .390 .300 .430 0 0 -1 0 0 eq 0
recipe_1 0 0 0 0 .3 .7 -1 0 eq 0
recipe_2 0 0 0 .2 0 .8 0 -1 eq 0
available 110 165 80 . . . . . upperbd .
;
The _ROW_ variable contains the names of the rows in the model; the variables A_LIGHT to JET_2
are the names of the structural variables in the model; the _TYPE_ variable contains the keywords
that tell the LP procedure how to interpret each row in the model; and the _RHS_ variable gives the
value of the right-hand-side constants.
The structural variables are interpreted as the quantity of each type of constituent or nished product.
For example, the value of A_HEAVY in the solution is the amount of Arabian heavy crude to buy
while the value of JET_1 in the solution is the amount of recipe 1 jet fuel that is produced. As
discussed previously, the values given in the model data set are the technological coefficients whose
interpretation depends on the model. In this example, the coefficient -175 in the PROFIT row for the
variable A_LIGHT gives a cost coefficient (because the row with _ROW_=PROFIT has _TYPE_=MAX)
for the structural variable A_LIGHT. This means that for each unit of Arabian light crude purchased,
a cost of 175 units is incurred.
The coefficients 0.035, 0.100, and 0.390 for the A_LIGHT variable give the percentages of each unit
of Arabian light crude that is distilled into the light naphtha, intermediate naphtha, and heating oil
components. The 110 value in the row _ROW_=AVAILABLE gives the quantity of Arabian light that
is available.
PROC LP produces the following Problem Summary output. Included in the summary is an
identification of the objective, defined by the first observation of the problem data set; the right-hand-
side variable, defined by the variable _RHS_; and the type identifier, defined by the variable _TYPE_.
See Output 5.1.1.
Output 5.1.1 Problem Summary for the Oil Blending Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
The next section of output (Output 5.1.2) contains the Solution Summary, which indicates whether
or not an optimal solution was found. In this example, the procedure terminates successfully (with
an optimal solution), with 1544 as the value of the objective function. Also included in this section
of output is the number of phase 1 and phase 2 iterations, the number of variables used in the initial
basic feasible solution, and the time used to solve the problem. For several options specied in the
PROC LP statement, the current option values are also displayed.
Output 5.1.2 Solution Summary for the Oil Blending Problem
The LP Procedure
Solution Summary
Terminated Successfully
Objective Value 1544
Phase 1 Iterations 0
Phase 2 Iterations 4
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 5
Time Used (seconds) 0
Number of Inversions 3
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
The next section of output (Output 5.1.3) contains the Variable Summary. A line is displayed for
each variable in the mathematical program with the variable name, the status of the variable in
the solution, the type of variable, the variable's price coefficient, the activity of the variable in the
solution, and the reduced cost for the variable. The status of a variable can be
BASIC if the variable is a basic variable in the solution.
DEGEN if the variable is a basic variable whose activity is at its input
lower bound.
ALTER if the variable is nonbasic and is basic in an alternate optimal
solution.
LOWBD if the variable is nonbasic and is at its lower bound.
UPPBD if the variable is nonbasic and is at its upper bound.
The TYPE column shows how PROC LP interprets the variable in the problem data set. Types
include the following:
NON-NEG if the variable is a nonnegative variable with lower bound 0
and upper bound ∞.
LOWERBD if the variable has a lower bound specified in a LOWERBD
observation and upper bound ∞.
UPPERBD if the variable has an upper bound that is less than ∞ and
lower bound 0. This upper bound is specified in an UPPERBD
observation.
UPLOWBD if the variable has a lower bound specified in a LOWERBD
observation and an upper bound specified in an UPPERBD
observation.
INTEGER if the variable is constrained to take integer values. If this is
the case, then it must also be upper and lower bounded.
BINARY if the variable is constrained to take value 0 or 1.
UNRSTRT if the variable is an unrestricted variable having bounds of
-∞ and ∞.
SLACK if the variable is a slack variable that PROC LP has appended
to a LE constraint. For variables of this type, the variable
name is the same as the name of the constraint (given in
the ROW variable) for which this variable is the slack. A
nonzero slack variable indicates that the constraint is not tight.
The slack is the amount by which the right-hand side of the
constraint exceeds the left-hand side.
SURPLUS if the variable is a surplus variable that PROC LP has ap-
pended to a GE constraint. For variables of this type, the
variable name is the same as the name of the constraint (given
in the ROW variable) for which this variable is the surplus. A
nonzero surplus variable indicates that the constraint is not
tight. The surplus is the amount by which the left-hand side
of the constraint exceeds the right-hand side.
The Variable Summary gives the value of the structural variables at optimality. In this example, it
tells you how to produce the jet fuel to maximize your profit. You should buy 110 units of A_LIGHT
and 80 units of BREGA. These are used to make 7.45 units of NAPHTHAL, 21.8 units of NAPHTHAI,
and 77.3 units of HEATINGO. These in turn are used to make 60.65 units of JET_1 using recipe 1 and
63.33 units of JET_2 using recipe 2.
Output 5.1.3 Variable Summary for the Oil Blending Problem
The LP Procedure
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 a_light UPPBD UPPERBD -175 110 11.6
2 a_heavy UPPERBD -165 0 -21.45
3 brega UPPBD UPPERBD -205 80 3.35
4 naphthal BASIC NON-NEG 0 7.45 0
5 naphthai BASIC NON-NEG 0 21.8 0
6 heatingo BASIC NON-NEG 0 77.3 0
7 jet_1 BASIC NON-NEG 300 60.65 0
8 jet_2 BASIC NON-NEG 300 63.33 0
The reduced cost associated with each nonbasic variable is the marginal value of that variable if it is
brought into the basis. In other words, the objective function value would (assuming no constraints
were violated) increase by the reduced cost of a nonbasic variable if that variable's value increased by
one. Similarly, the objective function value would (assuming no constraints were violated) decrease
by the reduced cost of a nonbasic variable if that variable's value were decreased by one. Basic
variables always have a zero reduced cost. At optimality, for a maximization problem, nonbasic
variables that are not at an upper bound have nonpositive reduced costs (for example, A_HEAVY has a
reduced cost of -21.45). The objective would decrease if they were to increase beyond their optimal
values. Nonbasic variables at upper bounds have nonnegative reduced costs, showing that increasing
the upper bound (if the reduced cost is not zero) does not decrease the objective. For a nonbasic
variable at its upper bound, the reduced cost is the marginal value of increasing its upper bound,
often called its shadow price.
For minimization problems, the definition of reduced costs remains the same but the conditions
for optimality change. For example, at optimality the reduced costs of all non-upper-bounded
variables are nonnegative, and the reduced costs of upper-bounded variables at their upper bound are
nonpositive.
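The reduced costs for the crudes can be reproduced by hand from the model coefficients: each crude's reduced cost at its bound is its net profit per unit, that is, the revenue from the jet fuel that one unit of the crude ultimately yields, minus the crude's cost. The following sketch (plain Python, outside the SAS session; the dictionary and function names are illustrative, not part of PROC LP) checks this arithmetic.

```python
# Illustrative check (not PROC LP output): the reduced cost of each crude
# at its bound equals its net profit per unit of crude purchased.
# Yields of (naphtha_light, naphtha_inter, heating_oil) per unit of crude:
yields = {
    "arabian_light": (0.035, 0.100, 0.390),
    "arabian_heavy": (0.030, 0.075, 0.300),
    "brega":         (0.045, 0.135, 0.430),
}
cost = {"arabian_light": 175, "arabian_heavy": 165, "brega": 205}
jet_price = 300  # selling price per unit of jet_1 and jet_2

def net_profit_per_unit(crude):
    """Jet fuel revenue from one unit of crude, minus the crude's cost."""
    nl, ni, ho = yields[crude]
    jet_1 = 0.3 * ni + 0.7 * ho   # recipe_1
    jet_2 = 0.2 * nl + 0.8 * ho   # recipe_2
    return jet_price * (jet_1 + jet_2) - cost[crude]

for crude in yields:
    print(crude, round(net_profit_per_unit(crude), 2))
# arabian_light 11.6, arabian_heavy -21.45, brega 3.35,
# matching the Reduced Cost column.
```

The 11.6 and -21.45 values reappear in Example 5.3, where they determine how far each crude's cost can move before the basis changes.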
The next section of output (Output 5.1.4) contains the Constraint Summary. For each constraint
row, free row, and objective row, a line is displayed in the Constraint Summary. Included on the
line are the constraint name, the row type, the slack or surplus variable associated with the row, the
right-hand-side constant associated with the row, the activity of the row (not including the activity of
the slack and surplus variables), and the dual activity (shadow prices).
A dual variable is associated with each constraint row. At optimality, the value of this variable, the
dual activity, tells you the marginal value of the right-hand-side constant. For each unit increase in
the right-hand-side constant, the objective changes by this amount. This quantity is also known as
the shadow price. For example, the marginal value for the right-hand-side constant of constraint
HEATING_O_CONV is -450.
Output 5.1.4 Constraint Summary for the Oil Blending Problem
The LP Procedure
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1544 .
2 naphtha_l_conv EQ . 0 0 -60
3 naphtha_i_conv EQ . 0 0 -90
4 heating_o_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
Example 5.2: A Sparse View of the Oil Blending Problem
Typically, mathematical programming models are very sparse. This means that only a small percentage
of the coefficients are nonzero. The sparse problem input is ideal for these models. The oil
blending problem in the section "An Introductory Example" on page 175 has a sparse form. This
example shows the same problem in a sparse form with the data given in a different order. In addition
to representing the problem in a concise form, the sparse format
• allows long column names
• enables easy matrix generation (see Example 5.12, Example 5.13, and Example 5.14)
• is compatible with MPS sparse format
The model in the sparse format is solved by invoking PROC LP with the SPARSEDATA option as
follows.
data oil;
format _type_ $8. _col_ $14. _row_ $16. ;
input _type_ $ _col_ $ _row_ $ _coef_ ;
datalines;
max . profit .
. arabian_light profit -175
. arabian_heavy profit -165
. brega profit -205
. jet_1 profit 300
. jet_2 profit 300
eq . napha_l_conv .
. arabian_light napha_l_conv .035
. arabian_heavy napha_l_conv .030
. brega napha_l_conv .045
. naphtha_light napha_l_conv -1
eq . napha_i_conv .
. arabian_light napha_i_conv .100
. arabian_heavy napha_i_conv .075
. brega napha_i_conv .135
. naphtha_inter napha_i_conv -1
eq . heating_oil_conv .
. arabian_light heating_oil_conv .390
. arabian_heavy heating_oil_conv .300
. brega heating_oil_conv .430
. heating_oil heating_oil_conv -1
eq . recipe_1 .
. naphtha_inter recipe_1 .3
. heating_oil recipe_1 .7
eq . recipe_2 .
. jet_1 recipe_1 -1
. naphtha_light recipe_2 .2
. heating_oil recipe_2 .8
. jet_2 recipe_2 -1
. _rhs_ profit 0
upperbd . available .
. arabian_light available 110
. arabian_heavy available 165
. brega available 80
;
proc lp SPARSEDATA;
run;
The output from PROC LP follows.
Output 5.2.1 Output for the Sparse Oil Blending Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
The LP Procedure
Solution Summary
Terminated Successfully
Objective Value 1544
Phase 1 Iterations 0
Phase 2 Iterations 5
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 5
Time Used (seconds) 0
Number of Inversions 3
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
The LP Procedure
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 arabian_heavy UPPERBD -165 0 -21.45
2 arabian_light UPPBD UPPERBD -175 110 11.6
3 brega UPPBD UPPERBD -205 80 3.35
4 heating_oil BASIC NON-NEG 0 77.3 0
5 jet_1 BASIC NON-NEG 300 60.65 0
6 jet_2 BASIC NON-NEG 300 63.33 0
7 naphtha_inter BASIC NON-NEG 0 21.8 0
8 naphtha_light BASIC NON-NEG 0 7.45 0
The LP Procedure
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1544 .
2 napha_l_conv EQ . 0 0 -60
3 napha_i_conv EQ . 0 0 -90
4 heating_oil_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
Example 5.3: Sensitivity Analysis: Changes in Objective Coefficients
Simple solution of a linear program is often not enough. A manager needs to evaluate how sensitive
the solution is to changing assumptions. The LP procedure provides several tools that are useful for
what-if, or sensitivity, analysis. One tool studies the effects of changes in the objective coefficients.
For example, in the oil blending problem, the cost of crude and the selling price of jet fuel can
be highly variable. If you want to know the range over which each objective coefficient can vary
without changing the variables in the basis, you can use the RANGEPRICE option in the PROC LP
statement.
proc lp data=oil sparsedata
rangeprice primalout=solution;
run;
In addition to the Problem and Solution summaries, the LP procedure produces a Price Range
Summary, shown in Output 5.3.1.
For each structural variable, the upper and lower ranges of the price (objective function coefficient)
and the objective value are shown. The blocking variables, those variables that would enter the basis
if the objective coefficient were perturbed further, are also given. For example, the output shows
that if the cost of ARABIAN_LIGHT crude were to increase from 175 to 186.6 per unit (remember
that you are maximizing profit, so the ARABIAN_LIGHT objective coefficient would decrease from
-175 to -186.6), then it would become optimal to use less of this crude for any fractional increase
in its cost. Increasing the unit cost to 186.6 would drive its reduced cost to zero. Any additional
increase would drive its reduced cost negative and would destroy the optimality conditions; thus,
you would want to use less of it in your processing. The output shows that, at the point where the
reduced cost is zero, you would only be realizing a profit of 268 = 1544 - (110 × 11.6) and that
ARABIAN_LIGHT enters the basis, that is, leaves its upper bound. On the other hand, if the cost of
ARABIAN_HEAVY were to decrease to 143.55, you would want to stop using the formulation of
110 units of ARABIAN_LIGHT and 80 units of BREGA and switch to a production scheme that
included ARABIAN_HEAVY, in which case the profit would increase from the 1544 level.
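The ranging figures quoted above can be checked with the same reduced-cost arithmetic. This sketch (illustrative Python, not part of PROC LP; the literal values are copied from the Variable Summary) reproduces the 186.6 and 143.55 blocking prices and the 268 profit:

```python
# Illustrative check of the RANGEPRICE figures, using the activity and
# reduced-cost values reported in the Variable Summary.
profit = 1544
activity_light = 110
rc = {"arabian_light": 11.6, "arabian_heavy": -21.45}

# arabian_light blocks when its cost rises enough to drive its reduced
# cost to zero: 175 + 11.6 = 186.6 per unit.
blocking_cost_light = 175 + rc["arabian_light"]
# Profit at that point: the 11.6-per-unit margin on all 110 units is gone.
profit_at_block = profit - activity_light * rc["arabian_light"]
# arabian_heavy becomes attractive once its cost falls by |reduced cost|:
blocking_cost_heavy = 165 + rc["arabian_heavy"]

print(blocking_cost_light, round(profit_at_block), blocking_cost_heavy)
# 186.6 268 143.55
```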
Output 5.3.1 Price Range Summary for the Oil Blending Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
Solution Summary
Terminated Successfully
Objective Value 1544
Phase 1 Iterations 0
Phase 2 Iterations 5
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 5
Time Used (seconds) 0
Number of Inversions 3
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.3.1 continued
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 arabian_heavy UPPERBD -165 0 -21.45
2 arabian_light UPPBD UPPERBD -175 110 11.6
3 brega UPPBD UPPERBD -205 80 3.35
4 heating_oil BASIC NON-NEG 0 77.3 0
5 jet_1 BASIC NON-NEG 300 60.65 0
6 jet_2 BASIC NON-NEG 300 63.33 0
7 naphtha_inter BASIC NON-NEG 0 21.8 0
8 naphtha_light BASIC NON-NEG 0 7.45 0
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1544 .
2 napha_l_conv EQ . 0 0 -60
3 napha_i_conv EQ . 0 0 -90
4 heating_oil_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
Price Range Analysis
-------------Minimum Phi------------
Col Variable Name Price Entering Objective
1 arabian_heavy -INFINITY . 1544
2 arabian_light -186.6 arabian_light 268
3 brega -208.35 brega 1276
4 heating_oil -7.790698 brega 941.77907
5 jet_1 290.19034 brega 949.04392
6 jet_2 290.50992 brega 942.99292
7 naphtha_inter -24.81481 brega 1003.037
8 naphtha_light -74.44444 brega 989.38889
Price Range Analysis
-------------Maximum Phi------------
Col Price Entering Objective
1 -143.55 arabian_heavy 1544
2 INFINITY . INFINITY
3 INFINITY . INFINITY
4 71.5 arabian_heavy 7070.95
5 392.25806 arabian_heavy 7139.4516
6 387.19512 arabian_heavy 7066.0671
7 286 arabian_heavy 7778.8
8 715 arabian_heavy 6870.75
Note that in the PROC LP statement, the PRIMALOUT=SOLUTION option was given. This caused
the procedure to save the optimal solution in a SAS data set named SOLUTION. This data set can
be used to perform further analysis on the problem without having to restart the solution process.
Example 5.4 shows how this is done. A display of the data follows in Output 5.3.2.
Output 5.3.2 The PRIMALOUT= Data Set for the Oil Blending Problem
Obs _OBJ_ID_ _RHS_ID_ _VAR_ _TYPE_ _STATUS_
1 profit _rhs_ arabian_heavy UPPERBD
2 profit _rhs_ arabian_light UPPERBD _UPPER_
3 profit _rhs_ brega UPPERBD _UPPER_
4 profit _rhs_ heating_oil NON-NEG _BASIC_
5 profit _rhs_ jet_1 NON-NEG _BASIC_
6 profit _rhs_ jet_2 NON-NEG _BASIC_
7 profit _rhs_ naphtha_inter NON-NEG _BASIC_
8 profit _rhs_ naphtha_light NON-NEG _BASIC_
9 profit _rhs_ PHASE_1_OBJECTIV OBJECT _DEGEN_
10 profit _rhs_ profit OBJECT _BASIC_
Obs _LBOUND_ _VALUE_ _UBOUND_ _PRICE_ _R_COST_
1 0 0.00 165 -165 -21.45
2 0 110.00 110 -175 11.60
3 0 80.00 80 -205 3.35
4 0 77.30 1.7977E308 0 0.00
5 0 60.65 1.7977E308 300 0.00
6 0 63.33 1.7977E308 300 0.00
7 0 21.80 1.7977E308 0 -0.00
8 0 7.45 1.7977E308 0 0.00
9 0 0.00 0 0 0.00
10 0 1544.00 1.7977E308 0 0.00
Example 5.4: Additional Sensitivity Analysis
The objective coefficient ranging analysis, discussed in the last example, is useful for assessing the
effects of changing costs and returns on the optimal solution if each objective function coefficient is
modified in isolation. However, this is often not the case.
Suppose you anticipate that the cost of crude will be increasing and you want to examine how
that will affect your optimal production plans. Furthermore, you estimate that if the price of
ARABIAN_LIGHT goes up by 1 unit, then the price of ARABIAN_HEAVY will rise by 1.2 units
and the price of BREGA will increase by 1.5 units. However, you plan on passing some of your
increased overhead on to your jet fuel customers, and you decide to increase the price of jet fuel 1
unit for each unit of increased cost of ARABIAN_LIGHT.
An examination of the solution sensitivity to changes in the cost of crude is a two-step process. First,
add the information on the proportional rates of change in the crude costs and the jet fuel price to
the problem data set. Then, invoke the LP procedure. The following program accomplishes this.
First, it adds a new row, named CHANGE, to the model. It gives this row a type of PRICESEN. That
tells PROC LP to perform objective function coefficient sensitivity analysis using the given rates of
change. The program then invokes PROC LP to perform the analysis. Notice that the
PRIMALIN=SOLUTION option is used in the PROC LP statement. This tells the LP procedure to use the saved
solution. Although it is not necessary to do this, it will eliminate the need for PROC LP to re-solve
the problem and can save computing time.
data sen;
format _type_ $8. _col_ $14. _row_ $6.;
input _type_ $ _col_ $ _row_ $ _coef_;
datalines;
pricesen . change .
. arabian_light change 1
. arabian_heavy change 1.2
. brega change 1.5
. jet_1 change -1
. jet_2 change -1
;
data;
set oil sen;
run;
proc lp sparsedata primalin=solution;
run;
Output 5.4.1 shows the range over which the current basic solution remains optimal, so that the
current production plan need not change. The objective coefficients are modified by adding φ times
the change vector given in the SEN data set, where φ ranges from a minimum of -4.15891 to a
maximum of 29.72973. At the minimum value of φ, the profit decreases to 1103.073. This value
of φ corresponds to an increase in the cost of ARABIAN_HEAVY to 169.99 (namely, 165 -
1.2 × φ), ARABIAN_LIGHT to 179.16 (= 175 - 1 × φ), and BREGA to 211.24 (= 205 - 1.5 × φ),
and corresponds to an increase in the price of JET_1 and JET_2 to 304.16 (= 300 + (-1) × φ). These
values can be found in the Price column under the section labeled Minimum Phi.
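The φ arithmetic can be spelled out numerically. The following sketch (illustrative Python, outside the SAS session; φ and the change vector are copied from the output and the SEN data set) recomputes the Minimum Phi prices:

```python
# Illustrative check of the Minimum Phi column: each objective coefficient
# becomes the original coefficient plus phi times the change vector (SEN).
phi = -4.158907511                       # Minimum Phi from Output 5.4.1
change = {"arabian_light": 1.0, "arabian_heavy": 1.2, "brega": 1.5,
          "jet_1": -1.0, "jet_2": -1.0}
coef = {"arabian_light": -175, "arabian_heavy": -165, "brega": -205,
        "jet_1": 300, "jet_2": 300}      # original objective coefficients

new_coef = {v: coef[v] + phi * change[v] for v in coef}

# Crude costs are the negatives of their objective coefficients:
print(round(-new_coef["arabian_heavy"], 2))  # 169.99
print(round(-new_coef["arabian_light"], 2))  # 179.16
print(round(-new_coef["brega"], 2))          # 211.24
print(round(new_coef["jet_1"], 2))           # 304.16
```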
Output 5.4.1 The Price Sensitivity Analysis Summary for the Oil Blending Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
Solution Summary
Terminated Successfully
Objective Value 1544
Phase 1 Iterations 0
Phase 2 Iterations 0
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 7
Time Used (seconds) 0
Number of Inversions 2
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.4.1 continued
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 arabian_heavy UPPERBD -165 0 -21.45
2 arabian_light UPPBD UPPERBD -175 110 11.6
3 brega UPPBD UPPERBD -205 80 3.35
4 heating_oil BASIC NON-NEG 0 77.3 0
5 jet_1 BASIC NON-NEG 300 60.65 0
6 jet_2 BASIC NON-NEG 300 63.33 0
7 naphtha_inter BASIC NON-NEG 0 21.8 0
8 naphtha_light BASIC NON-NEG 0 7.45 0
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1544 .
2 napha_l_conv EQ . 0 0 -60
3 napha_i_conv EQ . 0 0 -90
4 heating_oil_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
The LP Procedure
Price Sensitivity Analysis Summary
Sensitivity Vector change
Minimum Phi -4.158907511
Entering Variable brega
Optimal Objective 1103.0726257
Maximum Phi 29.72972973
Entering Variable arabian_heavy
Optimal Objective 4695.9459459
----Minimum Phi---- ----Maximum Phi----
Reduced Reduced
Col Variable Name Status Activity Price Cost Price Cost
1 arabian_heavy 0 -169.9907 -24.45065 -129.3243 0
2 arabian_light UPPBD 110 -179.1589 10.027933 -145.2703 22.837838
3 brega UPPBD 80 -211.2384 0 -160.4054 27.297297
4 heating_oil BASIC 77.3 0 0 0 0
5 jet_1 BASIC 60.65 304.15891 0 270.27027 0
6 jet_2 BASIC 63.33 304.15891 0 270.27027 0
7 naphtha_inter BASIC 21.8 0 0 0 0
8 naphtha_light BASIC 7.45 0 0 0 0
The Price Sensitivity Analysis Summary also shows the effects of lowering the cost of crude and
lowering the price of jet fuel. In particular, at the maximum φ of 29.72973, the current optimal
production plan yields a profit of 4695.95. Any increase or decrease in φ beyond the limits given
results in a change in the production plan. More precisely, the columns that constitute the basis
change.
Example 5.5: Price Parametric Programming for the Oil Blending
Problem
This example continues to examine the effects of a change in the cost of crude and the selling
price of jet fuel. Suppose that you know the cost of ARABIAN_LIGHT crude is likely to increase
30 units, with the effects on oil and fuel prices as described in Example 5.4. The analysis in the
last example only accounted for an increase of a little over 4 units (because the minimum φ was
-4.15891). Because an increase in the cost of ARABIAN_LIGHT beyond 4.15891 units requires a
change in the optimal basis, it may also require a change in the optimal production strategy. This
type of analysis, where you want to find how the solution changes with changes in the objective
function coefficients or right-hand-side vector, is called parametric programming.
You can answer this question by using the PRICEPHI= option in the PROC LP statement. The
following program instructs PROC LP to continually increase the cost of the crudes and the return
from jet fuel using the ratios given previously, until the cost of ARABIAN_LIGHT increases at least
30 units.
proc lp sparsedata primalin=solution pricephi=-30;
run;
The PRICEPHI= option in the PROC LP statement tells PROC LP to perform parametric programming
on any price change vectors specified in the problem data set. The value of the PRICEPHI=
option tells PROC LP how far to change the value of φ and in what direction. A specification of
PRICEPHI=-30 tells PROC LP to continue pivoting until φ reaches -30, that is, until the objective
coefficients equal the original coefficients plus -30 times the change vector.
Output 5.5.1 shows the result of this analysis. The first page is the Price Sensitivity Analysis
Summary, as discussed in Example 5.4. The next page is an accounting for the change in basis as a
result of decreasing φ beyond -4.1589. It shows that BREGA left the basis at an upper bound and
entered the basis at a lower bound. The interpretation of these basis changes can be difficult (Hadley
1962; Dantzig 1963).
The last page of output shows the optimal solution at the displayed value of φ, namely -30.6878.
At an increase of 30.6878 units in the cost of ARABIAN_LIGHT and the related changes to the
other crudes and the jet fuel, it is optimal to modify the production of jet fuel as shown in the activity
column. Although this plan is optimal, it results in a profit of 0. This may suggest that the ratio of a
unit increase in the price of jet fuel per unit increase in the cost of ARABIAN_LIGHT is lower than
desirable.
Output 5.5.1 Price Parametric Programming for the Oil Blending Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
Solution Summary
Terminated Successfully
Objective Value 1544
Phase 1 Iterations 0
Phase 2 Iterations 0
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 7
Time Used (seconds) 0
Number of Inversions 2
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.5.1 continued
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 arabian_heavy UPPERBD -165 0 -21.45
2 arabian_light UPPBD UPPERBD -175 110 11.6
3 brega UPPBD UPPERBD -205 80 3.35
4 heating_oil BASIC NON-NEG 0 77.3 0
5 jet_1 BASIC NON-NEG 300 60.65 0
6 jet_2 BASIC NON-NEG 300 63.33 0
7 naphtha_inter BASIC NON-NEG 0 21.8 0
8 naphtha_light BASIC NON-NEG 0 7.45 0
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1544 .
2 napha_l_conv EQ . 0 0 -60
3 napha_i_conv EQ . 0 0 -90
4 heating_oil_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
The LP Procedure
Price Sensitivity Analysis Summary
Sensitivity Vector change
Minimum Phi -4.158907511
Entering Variable brega
Optimal Objective 1103.0726257
Maximum Phi 29.72972973
Entering Variable arabian_heavy
Optimal Objective 4695.9459459
----Minimum Phi---- ----Maximum Phi----
Reduced Reduced
Col Variable Name Status Activity Price Cost Price Cost
1 arabian_heavy 0 -169.9907 -24.45065 -129.3243 0
2 arabian_light UPPBD 110 -179.1589 10.027933 -145.2703 22.837838
3 brega UPPBD 80 -211.2384 0 -160.4054 27.297297
4 heating_oil BASIC 77.3 0 0 0 0
5 jet_1 BASIC 60.65 304.15891 0 270.27027 0
6 jet_2 BASIC 63.33 304.15891 0 270.27027 0
7 naphtha_inter BASIC 21.8 0 0 0 0
8 naphtha_light BASIC 7.45 0 0 0 0
Output 5.5.1 continued
The LP Procedure
Price Parametric Programming Log
Sensitivity Vector change
Entering Current
Leaving Variable Variable Objective Phi
brega brega 1103.0726 -4.158908
The LP Procedure
Price Sensitivity Analysis Summary
Sensitivity Vector change
Minimum Phi -30.68783069
Entering Variable arabian_light
Optimal Objective 0
----Minimum Phi----
Reduced
Col Variable Name Status Activity Price Cost
1 arabian_heavy 0 -201.8254 -43.59127
2 arabian_light ALTER 110 -205.6878 0
3 brega 0 -251.0317 -21.36905
4 heating_oil BASIC 42.9 0 0
5 jet_1 BASIC 33.33 330.68783 0
6 jet_2 BASIC 35.09 330.68783 0
7 naphtha_inter BASIC 11 0 0
8 naphtha_light BASIC 3.85 0 0
What is the optimal return if φ is exactly -30? Because the change in the objective is linear as a
function of φ, you can calculate the objective for any value of φ between those given by linear
interpolation. For example, for any φ between -4.1589 and -30.6878, the optimal objective value is

φ × (1103.0726 - 0) / (-4.1589 - (-30.6878)) + b

where

b = 30.6878 × (1103.0726 - 0) / (-4.1589 - (-30.6878))

For φ = -30, this is 28.5988.
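Under the linearity assumption between breakpoints, the interpolation can be carried out directly. This sketch (illustrative Python; the breakpoint values are taken from the parametric output, so small rounding differences from the figure reported in the text are expected):

```python
# Linear interpolation of the optimal objective between the two parametric
# breakpoints (phi, objective) reported by PROC LP.
phi1, z1 = -4.158907511, 1103.0726257   # basis change at minimum phi (Example 5.4)
phi2, z2 = -30.68783069, 0.0            # final breakpoint from PRICEPHI=-30

slope = (z1 - z2) / (phi1 - phi2)

def objective(phi):
    """Optimal objective for phi between phi2 and phi1."""
    return z2 + (phi - phi2) * slope

print(round(objective(-30), 1))  # 28.6, in line with the value in the text
```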
Example 5.6: Special Ordered Sets and the Oil Blending Problem
Often managers want to evaluate the cost of making a choice among alternatives. In particular,
they want to make the most profitable choice. Suppose that only one oil crude can be used in the
production process. This identifies a set of variables of which only one can be above its lower bound.
This additional restriction could be included in the model by adding a binary integer variable for each
of the three crudes. Constraints would be needed that would drive the appropriate binary variable to
1 whenever the corresponding crude is used in the production process. Then a constraint limiting the
total of these variables to only one would be added. A similar formulation for a fixed charge problem
is shown in Example 5.8.
The SOSLE type implicitly does this. The following DATA step adds a row to the model that
identifies which variables are in the set. The SOSLE type tells the LP procedure that only one of the
variables in this set can be above its lower bound. If you use the SOSEQ type, it tells PROC LP that
exactly one of the variables in the set must be above its lower bound. Only integer variables can be
in an SOSEQ set.
data special;
format _type_ $6. _col_ $14. _row_ $8. ;
input _type_ $ _col_ $ _row_ $ _coef_;
datalines;
SOSLE . special .
. arabian_light special 1
. arabian_heavy special 1
. brega special 1
;
data;
set oil special;
run;
proc lp sparsedata;
run;
Output 5.6.1 includes an Integer Iteration Log. This log shows the progress that PROC LP is making
in solving the problem. This is discussed in some detail in Example 5.8.
Output 5.6.1 The Oil Blending Problem with a Special Ordered Set
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.00
Variables Number
Non-negative 5
Upper Bounded 3
Total 8
Constraints Number
EQ 5
Objective 1
Total 6
Integer Iteration Log
Iter Problem Condition Objective Branched Sinfeas Active Proximity
1 0 ACTIVE 1544 arabian_light 0 2 .
2 -1 SUBOPTIMAL 1276 . . 1 268
3 1 FATHOMED 268 . . 0 .
Solution Summary
Integer Optimal Solution
Objective Value 1276
Phase 1 Iterations 0
Phase 2 Iterations 5
Phase 3 Iterations 0
Integer Iterations 3
Integer Solutions 1
Initial Basic Feasible Variables 5
Time Used (seconds) 0
Number of Inversions 5
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.6.1 continued
Variable Summary
Reduced
Col Variable Name Status Type Price Activity Cost
1 arabian_heavy UPPERBD -165 0 -21.45
2 arabian_light UPPBD UPPERBD -175 110 11.6
3 brega UPPERBD -205 0 3.35
4 heating_oil BASIC NON-NEG 0 42.9 0
5 jet_1 BASIC NON-NEG 300 33.33 0
6 jet_2 BASIC NON-NEG 300 35.09 0
7 naphtha_inter BASIC NON-NEG 0 11 0
8 naphtha_light BASIC NON-NEG 0 3.85 0
Constraint Summary
S/S Dual
Row Constraint Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 1276 .
2 napha_l_conv EQ . 0 0 -60
3 napha_i_conv EQ . 0 0 -90
4 heating_oil_conv EQ . 0 0 -450
5 recipe_1 EQ . 0 0 -300
6 recipe_2 EQ . 0 0 -300
The solution shows that only the ARABIAN_LIGHT crude is purchased. The requirement that only
one crude be used in the production is met, and the profit is 1276. This tells you that the value of
purchasing crude from an additional source, namely BREGA, is worth 1544 - 1276 = 268.
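Because each crude's per-unit net profit is constant (it equals the reduced costs seen in Example 5.1), the single-crude alternatives that the SOSLE set chooses among can be evaluated by hand. This sketch (illustrative Python, not part of PROC LP) enumerates them:

```python
# Illustrative enumeration of the single-crude plans allowed by the SOSLE
# set: profit from one crude alone is its per-unit net profit times the
# amount purchased (zero if the crude is unprofitable to buy at all).
yields = {  # (naphtha_light, naphtha_inter, heating_oil) per unit of crude
    "arabian_light": (0.035, 0.100, 0.390),
    "arabian_heavy": (0.030, 0.075, 0.300),
    "brega":         (0.045, 0.135, 0.430),
}
cost = {"arabian_light": 175, "arabian_heavy": 165, "brega": 205}
available = {"arabian_light": 110, "arabian_heavy": 165, "brega": 80}

def single_crude_profit(crude):
    nl, ni, ho = yields[crude]
    jet = (0.3 * ni + 0.7 * ho) + (0.2 * nl + 0.8 * ho)  # recipes 1 and 2
    per_unit = 300 * jet - cost[crude]
    return max(0.0, per_unit * available[crude])  # buy nothing if unprofitable

profits = {c: round(single_crude_profit(c)) for c in yields}
print(profits)
```

The best choice, ARABIAN_LIGHT at 1276, matches the integer solution found by PROC LP, and BREGA alone would yield only 268, the same 268 that appears as the gap to the unrestricted two-crude plan.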
Example 5.7: Goal-Programming a Product Mix Problem
This example shows how to use PROC LP to solve a linear goal-programming problem. PROC
LP has the ability to solve a series of linear programs, each with a new objective function. These
objective functions are ordered by priority. The first step is to solve a linear program with the
highest priority objective function constrained only by the formal constraints in the model. Then,
the problem with the next highest priority objective function is solved, constrained by the formal
constraints in the model and by the value that the highest priority objective function realized. That
is, the second problem optimizes the second highest priority objective function among the alternate
optimal solutions to the first optimization problem. The process continues until a linear program is
solved for each of the objectives.
This technique is useful for differentiating among alternate optimal solutions to a linear program.
It also fits into the formal paradigm presented in goal programming. In goal programming, the
objective functions typically take on the role of driving a linear function of the structural variables
to meet a target level as closely as possible. The details of this can be found in many books on the
subject, including Ignizio (1976).
Consider the following problem taken from Ignizio (1976). A small paint company manufactures two
types of paint, latex and enamel. In production, the company uses 10 hours of labor to produce 100
gallons of latex and 15 hours of labor to produce 100 gallons of enamel. Without hiring outside help
or requiring overtime, the company has 40 hours of labor available each week. Furthermore, each
paint generates a profit at the rate of $1.00 per gallon. The company has the following objectives
listed in decreasing priority:
• avoid the use of overtime
• achieve a weekly profit of $1000
• produce at least 700 gallons of enamel paint each week
The program to solve this problem follows.
data object;
input _row_ $ latex enamel n1 n2 n3 p1 p2 p3 _type_ $ _rhs_;
datalines;
overtime . . . . . 1 . . min 1
profit . . . 1 . . . . min 2
enamel . . . . 1 . . . min 3
overtime 10 15 1 . . -1 . . eq 40
profit 100 100 . 1 . . -1 . eq 1000
enamel . 1 . . 1 . . -1 eq 7
;
proc lp data=object goalprogram;
run;
The data set called OBJECT contains the model. Its first three observations are the objective rows,
and the next three observations are the constraints. The values in the right-hand-side variable _RHS_
in the objective rows give the priority of the objectives. The objective in the first observation with
_ROW_=OVERTIME has the highest priority, the objective named PROFIT has the next highest,
and the objective named ENAMEL has the lowest. Note that the value of the right-hand-side variable
determines the priority, not the order, in the data set.
Because this example is set in the formal goal-programming scheme, the model has structural
variables representing negative (n1, n2, and n3) and positive (p1, p2, and p3) deviations from target
levels. For example, n1+p1 is the deviation from the objective of avoiding the use of overtime and
underusing the normal work time, namely using exactly 40 work hours. The other objectives are
handled similarly.
Notice that the PROC LP statement includes the GOALPROGRAM option. Without this option, the
procedure would solve three separate problems: one for each of the three objective functions. In that
case, however, the procedure would not constrain the second and third programs using the results of
the preceding programs; also, the values 1, 2, and 3 for _RHS_ in the objective rows would have no
effect.
Output 5.7.1 shows the solution of the goal program, apparently as three linear program outputs.
However, examination of the constraint summaries in the second and third problems shows that the
constraints labeled by the objectives OVERTIME and PROFIT have type FIXEDOBJ. This indicates
that these objective rows have become constraints in the subsequent problems.
Output 5.7.1 Goal Programming
The LP Procedure
Problem Summary
Objective Function Min overtime
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.83
Variables Number
Non-negative 8
Total 8
Constraints Number
EQ 3
Objective 3
Total 6
Solution Summary
Terminated Successfully
Objective Value 0
Phase 1 Iterations 2
Phase 2 Iterations 0
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 7
Time Used (seconds) 0
Number of Inversions 2
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.7.1 continued
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 latex ALTER NON-NEG 0 0 0
2 enamel ALTER NON-NEG 0 0 0
3 n1 BASIC NON-NEG 0 40 0
4 n2 BASIC NON-NEG 0 1000 0
5 n3 BASIC NON-NEG 0 7 0
6 p1 NON-NEG 1 0 1
7 p2 ALTER NON-NEG 0 0 0
8 p3 ALTER NON-NEG 0 0 0
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 overtime OBJECTVE . 0 0 .
2 profit FREE_OBJ . 0 1000 .
3 enamel FREE_OBJ . 0 7 .
4 overtime EQ . 40 40 0
5 profit EQ . 1000 1000 0
6 enamel EQ . 7 7 0
Problem Summary
Objective Function Min profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.83
Variables Number
Non-negative 8
Total 8
Constraints Number
EQ 3
Objective 3
Total 6
Output 5.7.1 continued
Solution Summary
Terminated Successfully
Objective Value 600
Phase 1 Iterations 0
Phase 2 Iterations 3
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 7
Time Used (seconds) 0
Number of Inversions 5
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 latex BASIC NON-NEG 0 4 0
2 enamel NON-NEG 0 0 50
3 n1 NON-NEG 0 0 10
4 n2 BASIC NON-NEG 1 600 0
5 n3 BASIC NON-NEG 0 7 0
6 p1 DEGEN NON-NEG 0 0 0
7 p2 NON-NEG 0 0 1
8 p3 ALTER NON-NEG 0 0 0
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 overtime FIXEDOBJ . 0 0 .
2 profit OBJECTVE . 0 600 .
3 enamel FREE_OBJ . 0 7 .
4 overtime EQ . 40 40 -10
5 profit EQ . 1000 1000 1
6 enamel EQ . 7 7 0
Output 5.7.1 continued
Problem Summary
Objective Function Min enamel
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 45.83
Variables Number
Non-negative 8
Total 8
Constraints Number
EQ 3
Objective 3
Total 6
Solution Summary
Terminated Successfully
Objective Value 7
Phase 1 Iterations 0
Phase 2 Iterations 1
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 7
Time Used (seconds) 0
Number of Inversions 8
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.7.1 continued
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 latex BASIC NON-NEG 0 4 0
2 enamel DEGEN NON-NEG 0 0 0
3 n1 NON-NEG 0 0 0.2
4 n2 BASIC NON-NEG 0 600 0
5 n3 BASIC NON-NEG 1 7 0
6 p1 DEGEN NON-NEG 0 0 0
7 p2 NON-NEG 0 0 0.02
8 p3 NON-NEG 0 0 1
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 overtime FIXEDOBJ . 0 0 .
2 profit FIXEDOBJ . 0 600 .
3 enamel OBJECTVE . 0 7 .
4 overtime EQ . 40 40 -0.2
5 profit EQ . 1000 1000 0.02
6 enamel EQ . 7 7 1
The solution to the last linear program shows a value of 4 for the variable LATEX and a value of 0 for
the variable ENAMEL. This tells you that the solution to the linear goal program is to produce 400
gallons of latex and no enamel paint.
The values of the objective functions in the three linear programs tell you whether you can achieve
the three objectives. The activities of the constraints labeled OVERTIME, PROFIT, and ENAMEL
tell you the values of the three linear program objectives. Because the first linear programming objective
OVERTIME is 0, the highest priority objective, which is to avoid using additional labor, is accomplished.
However, because the second and third objectives are nonzero, the second and third priority
objectives are not satisfied completely. The PROFIT objective is 600. Because the PROFIT objective
is to minimize the negative deviation from the profit constraint, this means that a profit of only 400 =
1000 - 600 is realized. Similarly, the ENAMEL objective is 7, indicating that there is a negative
deviation from the ENAMEL target of 7 units.
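The arithmetic above can be traced with a short sketch (plain Python, not part of the SAS example; only the targets and deviations come from Output 5.7.1, the helper function is illustrative). Each goal row relates its target to the achieved value through a negative deviation n and a positive deviation p:

```python
def achieved(target, n, p):
    """Value attained for a goal row: achieved = target - n + p,
    where n is underachievement and p is overachievement."""
    return target - n + p

# Deviations reported in Output 5.7.1: n2 = 600 against the PROFIT
# target of 1000, and n3 = 7 against the ENAMEL target of 7.
print(achieved(1000, n=600, p=0))   # prints 400 -- the realized profit
print(achieved(7, n=7, p=0))        # prints 0 -- no enamel is produced
```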
Example 5.8: A Simple Integer Program
Recall the linear programming problem presented in Chapter 3, "Introduction to Optimization." In
that problem, a firm produces two products, chocolates and gumdrops, that are processed by four
processes: cooking, color/flavor, condiments, and packaging. The objective is to determine the
product mix that maximizes the profit to the firm while not exceeding manufacturing capacities. The
problem is extended to demonstrate a use of integer-constrained variables.
Suppose that you must manufacture only one of the two products. In addition, there is a setup cost of
100 if you make the chocolates and 75 if you make the gumdrops. To identify which product will
maximize profit, you define two zero-one integer variables, ICHOCO and IGUMDR, and you also define
two new constraints, CHOCOLATE and GUM. The constraint labeled CHOCOLATE forces ICHOCO to
equal one when chocolates are manufactured. Similarly, the constraint labeled GUM forces IGUMDR to
equal 1 when gumdrops are manufactured. Also, you should include a constraint labeled ONLY_ONE
that requires the sum of ICHOCO and IGUMDR to equal 1. (Note that this could be accomplished more
simply by including ICHOCO and IGUMDR in a SOSEQ set.) Since ICHOCO and IGUMDR are integer
variables, this constraint eliminates the possibility of both products being manufactured. Notice the
coefficients -10000, which are used to force ICHOCO or IGUMDR to 1 whenever CHOCO and GUMDR
are nonzero. This technique, which is often used in integer programming, can cause severe numerical
problems. If this driving coefficient is too large, then arithmetic overflows and underflows may result.
If the driving coefficient is too small, then the integer variable may not be driven to 1 as desired by
the modeler.
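To see the driving coefficient at work, assume the linking row has the form CHOCO - M*ICHOCO <= 0. The indicator must then satisfy ICHOCO >= CHOCO/M, so it can be driven to exactly 1 only while M is at least as large as any feasible value of CHOCO. A small Python sketch with illustrative values (not taken from the example):

```python
import math

def min_indicator(production, big_m):
    """Smallest integer ICHOCO satisfying production - big_m*ICHOCO <= 0."""
    return math.ceil(production / big_m)

# With M = 10000, any production level up to 10000 drives the indicator to 1.
print(min_indicator(1440, 10000))   # prints 1

# With M chosen too small, no 0/1 value satisfies the row at this level:
print(min_indicator(1440, 1000))    # prints 2 -- outside the binary range
```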
The objective coefficients of the integer variables ICHOCO and IGUMDR are the negatives of the setup
costs for the two products. The following is the data set that describes this problem and the call to
PROC LP to solve it:
data;
format _row_ $10. ;
input _row_ $ choco gumdr ichoco igumdr _type_ $ _rhs_;
datalines;
object .25 .75 -100 -75 max .
cooking 15 40 0 0 le 27000
color 0 56.25 0 0 le 27000
package 18.75 0 0 0 le 27000
condiments 12 50 0 0 le 27000
chocolate 1 0 -10000 0 le 0
gum 0 1 0 -10000 le 0
only_one 0 0 1 1 eq 1
binary . . 1 2 binary .
;
proc lp;
run;
The solution shows that gumdrops are produced. See Output 5.8.1.
Output 5.8.1 Summaries and an Integer Programming Iteration Log
The LP Procedure
Problem Summary
Objective Function Max object
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 25.71
Variables Number
Non-negative 2
Binary 2
Slack 6
Total 10
Constraints Number
LE 6
EQ 1
Objective 1
Total 8
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
1 0 ACTIVE 397.5 ichoco 0.1 0.2 2 .
2 -1 SUBOPTIMAL 260 . . . 1 70
3 1 SUBOPTIMAL 285 . . . 0 .
Solution Summary
Integer Optimal Solution
Objective Value 285
Phase 1 Iterations 0
Phase 2 Iterations 5
Phase 3 Iterations 5
Integer Iterations 3
Integer Solutions 2
Initial Basic Feasible Variables 9
Time Used (seconds) 0
Number of Inversions 5
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.8.1 continued
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 choco DEGEN NON-NEG 0.25 0 0
2 gumdr BASIC NON-NEG 0.75 480 0
3 ichoco BINARY -100 0 2475
4 igumdr BASIC BINARY -75 1 0
5 cooking BASIC SLACK 0 7800 0
6 color SLACK 0 0 -0.013333
7 package BASIC SLACK 0 27000 0
8 condiments BASIC SLACK 0 3000 0
9 chocolate SLACK 0 0 -0.25
10 gum BASIC SLACK 0 9520 0
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 object OBJECTVE . 0 285 .
2 cooking LE 5 27000 19200 0
3 color LE 6 27000 27000 0.0133333
4 package LE 7 27000 0 0
5 condiments LE 8 27000 24000 0
6 chocolate LE 9 0 0 0.25
7 gum LE 10 0 -9520 0
8 only_one EQ . 1 1 -75
The branch-and-bound tree can be reconstructed from the information contained in the integer
iteration log. The column labeled Iter numbers the integer iterations. The column labeled Problem
identifies the Iter number of the parent problem from which the current problem is defined. For
example, Iter=2 has Problem=-1. This means that problem 2 is a direct descendant of problem 1.
Furthermore, because problem 1 branched on ICHOCO, you know that problem 2 is identical to
problem 1 with an additional constraint on variable ICHOCO. The minus sign in the Problem=-1 in
Iter=2 tells you that the new constraint on variable ICHOCO is a lower bound. Moreover, because
Value=0.1 in Iter=1, you know that ICHOCO=0.1 in Iter=1 so that the added constraint in Iter=2 is
ICHOCO >= 1 (the least integer greater than or equal to 0.1). In this way, the information in the log can be used to reconstruct the branch-and-
bound tree. In fact, when you save an ACTIVEOUT= data set, it contains information in this format
that is used to reconstruct the tree when you restart a problem using the ACTIVEIN= data set. See
Example 5.10.
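That decoding rule can be sketched in Python (an illustrative reading of the log, assuming the parent's branching variable and fractional value are known; the data are the rows of Output 5.8.1):

```python
import math

# Branching variable and fractional value recorded for each parent problem
# in the integer iteration log (here, Iter=1 branched on ichoco = 0.1).
branch_info = {1: ("ichoco", 0.1)}

def added_constraint(problem_field):
    """Decode a child's Problem column: a negative value means the branch
    added a lower bound on the parent's branching variable; a positive
    value means an upper bound."""
    var, frac = branch_info[abs(problem_field)]
    if problem_field < 0:
        return f"{var} >= {math.ceil(frac)}"
    return f"{var} <= {math.floor(frac)}"

print(added_constraint(-1))   # Iter=2, Problem=-1: prints ichoco >= 1
print(added_constraint(1))    # Iter=3, Problem=1:  prints ichoco <= 0
```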
Note that if you defined a SOSEQ special ordered set containing the variables CHOCO and GUMDR,
the integer variables ICHOCO and IGUMDR and the three associated constraints would not have been
needed.
Example 5.9: An Infeasible Problem
This is an example of the Infeasible Information Summary that is displayed when an infeasible
problem is encountered. Consider the following problem:
   maximize    x1 + x2 + x3 + x4
   subject to  x1 + 3x2 + 2x3 + 4x4 <= 5
              3x1 +  x2 + 2x3 +  x4 <= 4
              5x1 + 3x2 + 3x3 + 3x4  = 9
              x1, x2, x3, x4 >= 0
Examination of this problem reveals that it is unsolvable. Consequently, PROC LP identifies it as
infeasible. The following program attempts to solve it.
data infeas;
format _id_ $6.;
input _id_ $ x1-x4 _type_ $ _rhs_;
datalines;
profit 1 1 1 1 max .
const1 1 3 2 4 le 5
const2 3 1 2 1 le 4
const3 5 3 3 3 eq 9
;
proc lp;
run;
The results are shown in Output 5.9.1.
Output 5.9.1 The Solution of an Infeasible Problem
The LP Procedure
Problem Summary
Objective Function Max profit
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 77.78
Variables Number
Non-negative 4
Slack 2
Total 6
Constraints Number
LE 2
EQ 1
Objective 1
Total 4
ERROR: Infeasible problem. Note the constraints in the constraint summary
that are identified as infeasible. If none of the constraints are
flagged then check the implicit bounds on the variables.
The LP Procedure
Solution Summary
Infeasible Problem
Objective Value 2.5
Phase 1 Iterations 2
Phase 2 Iterations 0
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 5
Time Used (seconds) 0
Number of Inversions 2
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
The LP Procedure
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 x1 BASIC NON-NEG 1 0.75 0
2 x2 BASIC NON-NEG 1 1.75 0
3 x3 NON-NEG 1 0 0.5
4 x4 NON-NEG 1 0 0
5 const1 *INF* BASIC SLACK 0 -1 0
6 const2 SLACK 0 0 0.5
The LP Procedure
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 profit OBJECTVE . 0 2.5 .
2 const1 *INF* LE 5 5 6 0
3 const2 LE 6 4 4 -0.5
4 const3 EQ . 9 9 0.5
The LP Procedure
Infeasible Information Summary
Infeasible Row const1
Constraint Activity 6
Row Type LE
Rhs Data 5
Lower Upper
Variable Coefficient Activity Bound Bound
x1 1 0.75 0 INFINITY
x2 3 1.75 0 INFINITY
x3 2 0 0 INFINITY
x4 4 0 0 INFINITY
Note the information given in the Infeasible Information Summary for the infeasible row CONST1.
It shows that the inequality row CONST1 with right-hand side 5 was found to be infeasible with
activity 6. The summary also shows each variable that has a nonzero coefficient in that row and its
activity level at the infeasibility. Examination of these model parameters might give you a clue as to
the cause of infeasibility, such as an incorrectly entered coefficient or right-hand-side value.
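The reported activity can be reproduced directly from the summary (a quick Python check, not part of the PROC LP output):

```python
# CONST1 coefficients for x1..x4, the phase-1 activities of those
# variables, and the right-hand side, all taken from the summary above.
coefficients = [1, 3, 2, 4]
activities = [0.75, 1.75, 0, 0]
rhs = 5

row_activity = sum(c * x for c, x in zip(coefficients, activities))
print(row_activity)        # prints 6.0
print(row_activity > rhs)  # prints True -- the LE constraint is violated
```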
Example 5.10: Restarting an Integer Program
The following example is attributed to Haldi (Garfinkel and Nemhauser 1972) and is used in the
literature as a test problem. Notice that the ACTIVEOUT= and the PRIMALOUT= options are used
when invoking PROC LP. These cause the LP procedure to save the primal solution in the data set
named P and the active tree in the data set named A. If the procedure fails to find an optimal integer
solution on the initial call, it can be called later using the A and P data sets as starting information.
data haldi10;
input x1-x12 _type_ $ _rhs_;
datalines;
0 0 0 0 0 0 1 1 1 1 1 1 MAX .
9 7 16 8 24 5 3 7 8 4 6 5 LE 110
12 6 6 2 20 8 4 6 3 1 5 8 LE 95
15 5 12 4 4 5 5 5 6 2 1 5 LE 80
18 4 4 18 28 1 6 4 2 9 7 1 LE 100
-12 0 0 0 0 0 1 0 0 0 0 0 LE 0
0 -15 0 0 0 0 0 1 0 0 0 0 LE 0
0 0 -12 0 0 0 0 0 1 0 0 0 LE 0
0 0 0 -10 0 0 0 0 0 1 0 0 LE 0
0 0 0 0 -11 0 0 0 0 0 1 0 LE 0
0 0 0 0 0 -11 0 0 0 0 0 1 LE 0
1 1 1 1 1 1 1000 1000 1000 1000 1000 1000 UPPERBD .
1 2 3 4 5 6 7 8 9 10 11 12 INTEGER .
;
proc lp data=haldi10 activeout=a primalout=p;
run;
The ACTIVEOUT= data set contains a representation of the current active problems in the branch-
and-bound tree. The PRIMALOUT= data set contains a representation of the solution to the current
problem. These two can be used to restore the procedure to an equivalent state to the one it was in
when it stopped.
The results from the call to PROC LP are shown in Output 5.10.1. Notice that the procedure performed
100 integer iterations and then terminated on the maximum integer iterations. This is because, by default,
IMAXIT=100. The procedure reports the current best integer solution.
Output 5.10.1 Output from the HALDI10 Problem
The LP Procedure
Problem Summary
Objective Function Max _OBS1_
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 31.82
Variables Number
Integer 6
Binary 6
Slack 10
Total 22
Constraints Number
LE 10
Objective 1
Total 11
The LP Procedure
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
1 0 ACTIVE 18.709524 x9 1.543 1.11905 2 .
2 1 ACTIVE 18.467723 x12 9.371 0.88948 3 .
3 2 ACTIVE 18.460133 x8 0.539 1.04883 4 .
4 -3 ACTIVE 18.453638 x12 8.683 1.12993 5 .
5 4 ACTIVE 18.439678 x10 7.448 1.20125 6 .
6 5 ACTIVE 18.403728 x6 0.645 1.3643 7 .
7 -6 ACTIVE 18.048289 x4 0.7 1.18395 8 .
8 -7 ACTIVE 17.679087 x8 1.833 0.52644 9 .
9 8 ACTIVE 17.52 x10 6.667 0.70111 10 .
10 9 ACTIVE 17.190085 x12 7.551 1.37615 11 .
11 -10 ACTIVE 17.02 x1 0.085 0.255 12 .
12 11 ACTIVE 16.748 x11 0.748 0.47 13 .
13 -12 ACTIVE 16.509091 x9 0.509 0.69091 14 .
14 13 ACTIVE 16.261333 x11 1.261 0.44267 15 .
15 14 ACTIVE 16 x3 0.297 0.45455 16 .
16 15 ACTIVE 16 x5 0.091 0.15758 16 .
17 -16 INFEASIBLE -0.4 . . . 15 .
18 -15 ACTIVE 11.781818 x10 1.782 0.37576 15 .
19 18 ACTIVE 11 x5 0.091 0.15758 15 .
20 -19 INFEASIBLE -6.4 . . . 14 .
21 -14 ACTIVE 11.963636 x5 0.182 0.28485 14 .
22 -21 INFEASIBLE -4.4 . . . 13 .
23 -13 ACTIVE 15.281818 x10 4.282 0.52273 13 .
24 23 ACTIVE 15.041333 x5 0.095 0.286 14 .
25 -24 INFEASIBLE -2.9 . . . 13 .
26 24 INFEASIBLE 14 . . . 12 .
27 12 ACTIVE 16 x3 0.083 0.15 13 .
28 -27 ACTIVE 15.277778 x9 0.278 0.34444 14 .
29 -28 ACTIVE 13.833333 x10 3.833 0.23333 14 .
30 29 ACTIVE 13 x2 0.4 0.4 15 .
31 30 INFEASIBLE 12 . . . 14 .
32 -30 SUBOPTIMAL 10 . . . 13 8
33 28 ACTIVE 15 x2 0.067 0.06667 13 8
34 -33 SUBOPTIMAL 12 . . . 12 6
35 27 ACTIVE 15 x2 0.067 0.06667 12 6
36 -35 SUBOPTIMAL 15 . . . 11 3
37 -11 FATHOMED 14.275 . . . 10 3
38 10 ACTIVE 16.804848 x1 0.158 0.50313 11 3
39 -38 FATHOMED 14.784 . . . 10 3
40 38 ACTIVE 16.40381 x11 1.404 0.68143 11 3
41 -40 ACTIVE 16.367677 x10 5.368 0.69949 12 3
42 41 ACTIVE 16.113203 x11 2.374 1.00059 12 3
43 42 ACTIVE 16 x5 0.182 0.33182 12 3
44 -43 FATHOMED 13.822222 . . . 11 3
The LP Procedure
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
45 -41 FATHOMED 12.642424 . . . 10 3
46 40 ACTIVE 16 x5 0.229 0.37857 10 3
47 46 FATHOMED 15 . . . 9 3
48 -9 ACTIVE 17.453333 x7 0.453 0.64111 10 3
49 48 ACTIVE 17.35619 x11 0.356 0.53857 11 3
50 49 ACTIVE 17 x5 0.121 0.27143 12 3
51 50 ACTIVE 17 x3 0.083 0.15 13 3
52 -51 FATHOMED 15.933333 . . . 12 3
53 51 ACTIVE 16 x2 0.067 0.06667 12 3
54 -53 SUBOPTIMAL 16 . . . 8 2
55 -8 ACTIVE 17.655399 x12 7.721 0.56127 9 2
56 55 ACTIVE 17.519375 x10 6.56 0.76125 10 2
57 56 ACTIVE 17.256874 x2 0.265 0.67388 11 2
58 57 INFEASIBLE 17.167622 . . . 10 2
59 -57 FATHOMED 16.521755 . . . 9 2
60 -56 FATHOMED 17.03125 . . . 8 2
61 -55 ACTIVE 17.342857 x9 0.343 0.50476 8 2
62 61 ACTIVE 17.2225 x7 0.16 0.37333 9 2
63 62 ACTIVE 17.1875 x8 2.188 0.33333 9 2
64 63 ACTIVE 17.153651 x11 0.154 0.30095 10 2
65 -64 FATHOMED 12.381818 . . . 9 2
66 64 ACTIVE 17 x2 0.133 0.18571 9 2
67 -66 FATHOMED 13 . . . 8 2
68 -62 FATHOMED 14.2 . . . 7 2
69 7 FATHOMED 15.428583 . . . 6 2
70 6 FATHOMED 16.75599 . . . 5 2
71 -5 ACTIVE 17.25974 x6 0.727 0.82078 5 2
72 -71 FATHOMED 17.142857 . . . 4 2
73 -4 ACTIVE 18.078095 x4 0.792 0.70511 5 2
74 -73 ACTIVE 17.662338 x10 7.505 0.91299 5 2
75 74 ACTIVE 17.301299 x9 0.301 0.57489 5 2
76 75 ACTIVE 17.210909 x7 0.211 0.47697 5 2
77 76 FATHOMED 17.164773 . . . 4 2
78 73 FATHOMED 12.872727 . . . 3 2
79 3 ACTIVE 18.368316 x10 7.602 1.20052 4 2
80 79 ACTIVE 18.198323 x7 1.506 1.85351 5 2
81 80 ACTIVE 18.069847 x12 8.517 1.67277 6 2
82 -81 ACTIVE 17.910909 x4 0.7 0.73015 7 2
83 -82 ACTIVE 17.790909 x7 0.791 0.54015 8 2
84 -83 ACTIVE 17.701299 x9 0.701 0.62229 8 2
85 84 ACTIVE 17.17619 x6 0.818 0.45736 8 2
86 -85 ACTIVE 17.146667 x11 0.147 0.24333 8 2
87 86 ACTIVE 17 x1 0.167 0.16667 8 2
88 87 INFEASIBLE 16 . . . 7 2
The LP Procedure
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
89 83 ACTIVE 17.58 x11 0.58 0.73788 8 2
90 -89 FATHOMED 17.114286 . . . 7 2
91 -80 ACTIVE 18.044048 x12 8.542 1.71158 8 2
92 91 ACTIVE 17.954536 x11 0.477 1.90457 9 2
93 92 ACTIVE 17.875084 x4 0.678 1.16624 10 2
94 93 FATHOMED 13.818182 . . . 9 2
95 -93 ACTIVE 17.231221 x6 0.727 0.76182 9 2
96 -95 FATHOMED 17.085714 . . . 8 2
97 -92 FATHOMED 17.723058 . . . 7 2
98 -91 FATHOMED 16.378788 . . . 6 2
99 89 ACTIVE 17 x6 0.818 0.26515 6 2
100 -99 ACTIVE 17 x3 0.083 0.08333 6 2
WARNING: The maximum number of integer iterations has been exceeded. Increase
this limit with the 'IMAXIT=' option on the RESET statement.
The LP Procedure
Solution Summary
Terminated on Maximum Integer Iterations
Integer Feasible Solution
Objective Value 16
Phase 1 Iterations 0
Phase 2 Iterations 13
Phase 3 Iterations 161
Integer Iterations 100
Integer Solutions 4
Initial Basic Feasible Variables 12
Time Used (seconds) 0
Number of Inversions 37
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
The LP Procedure
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 x1 DEGEN BINARY 0 0 0
2 x2 ALTER BINARY 0 1 0
3 x3 BINARY 0 0 12
4 x4 ALTER BINARY 0 1 0
5 x5 ALTER BINARY 0 0 0
6 x6 ALTER BINARY 0 1 0
7 x7 INTEGER 1 0 1
8 x8 INTEGER 1 1 1
9 x9 DEGEN INTEGER 1 0 0
10 x10 INTEGER 1 7 1
11 x11 INTEGER 1 0 1
12 x12 INTEGER 1 8 1
13 _OBS2_ BASIC SLACK 0 15 0
14 _OBS3_ BASIC SLACK 0 2 0
15 _OBS4_ BASIC SLACK 0 7 0
16 _OBS5_ BASIC SLACK 0 2 0
17 _OBS6_ ALTER SLACK 0 0 0
18 _OBS7_ BASIC SLACK 0 14 0
19 _OBS8_ SLACK 0 0 -1
20 _OBS9_ BASIC SLACK 0 3 0
21 _OBS10_ DEGEN SLACK 0 0 0
22 _OBS11_ BASIC SLACK 0 3 0
The LP Procedure
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 _OBS1_ OBJECTVE . 0 16 .
2 _OBS2_ LE 13 110 95 0
3 _OBS3_ LE 14 95 93 0
4 _OBS4_ LE 15 80 73 0
5 _OBS5_ LE 16 100 98 0
6 _OBS6_ LE 17 0 0 0
7 _OBS7_ LE 18 0 -14 0
8 _OBS8_ LE 19 0 0 1
9 _OBS9_ LE 20 0 -3 0
10 _OBS10_ LE 21 0 0 0
11 _OBS11_ LE 22 0 -3 0
To continue with the solution of this problem, invoke PROC LP with the ACTIVEIN= and
PRIMALIN= options and reset the IMAXIT= option. This restores the branch-and-bound tree and simplifies
calculating a basic feasible solution from which to start processing.
proc lp data=haldi10 activein=a primalin=p imaxit=250;
run;
The procedure picks up iterating from an equivalent state to the one in which it left off. The problem still
will not be solved when the IMAXIT=250 limit is reached.
Example 5.11: Alternative Search of the Branch-and-Bound Tree
In this example, the HALDI10 problem from Example 5.10 is solved. However, here the default
strategy for searching the branch-and-bound tree is modified. By default, the search strategy has
VARSELECT=FAR. This means that when searching for an integer variable on which to branch,
the procedure uses the one that has a value farthest from an integer value. An alternative strategy
has VARSELECT=PENALTY. This strategy causes PROC LP to look at the cost, in terms of the
objective function, of branching on an integer variable. The procedure looks at PENALTYDEPTH=
integer variables before choosing the one with the largest cost. This is a much more expensive
strategy (in terms of execution time) than the VARSELECT=FAR strategy, but it can be beneficial if
fewer integer iterations must be done to find an optimal solution.
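The VARSELECT=FAR rule itself is easy to state: among the integer variables with fractional values, branch on the one farthest from its nearest integer. A minimal Python sketch with hypothetical fractional values (this is not PROC LP's implementation, and VARSELECT=PENALTY additionally prices each candidate against the objective):

```python
def farthest_from_integer(fractional_values):
    """Pick the variable whose relaxation value is farthest from the
    nearest integer, mimicking the VARSELECT=FAR branching rule."""
    def distance(v):
        return abs(v - round(v))
    return max(fractional_values, key=lambda name: distance(fractional_values[name]))

# Hypothetical LP-relaxation values for three integer variables:
relaxation = {"x1": 0.1, "x2": 0.457, "x3": 2.9}
print(farthest_from_integer(relaxation))   # prints x2
```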
proc lp data=haldi10 varselect=penalty;
run;
Compare the number of integer iterations needed to solve the problem using this heuristic with
the default strategy used in Example 5.10. In this example, the difference is profound; in general,
solution times can vary significantly with the search technique. See Output 5.11.1.
Output 5.11.1 Summaries and an Integer Programming Iteration Log: Using
VARSELECT=PENALTY
The LP Procedure
Problem Summary
Objective Function Max _OBS1_
Rhs Variable _rhs_
Type Variable _type_
Problem Density (%) 31.82
Variables Number
Integer 6
Binary 6
Slack 10
Total 22
Constraints Number
LE 10
Objective 1
Total 11
Output 5.11.1 continued
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
1 0 ACTIVE 18.709524 x4 0.8 1.11905 2 .
2 1 ACTIVE 16.585187 x1 0.447 2.33824 3 .
3 2 ACTIVE 14.86157 x5 0.221 2.09584 4 .
4 3 ACTIVE 14.807195 x2 0.897 1.31729 5 .
5 -4 ACTIVE 14.753205 x8 14.58 0.61538 6 .
6 5 ACTIVE 14.730078 x6 0.043 0.79446 7 .
7 -6 ACTIVE 13.755102 x3 0.051 0.58163 8 .
8 -7 ACTIVE 11.6 x8 11.6 0.4 9 .
9 8 ACTIVE 11.6 x12 0.6 0.4 10 .
10 -9 ACTIVE 11.6 x8 10.6 0.4 11 .
11 10 ACTIVE 11.6 x12 1.6 0.4 12 .
12 -11 ACTIVE 11.6 x8 9.6 0.4 13 .
13 12 ACTIVE 11.6 x12 2.6 0.4 14 .
14 -13 ACTIVE 11.571429 x9 0.143 0.57143 15 .
15 14 ACTIVE 11.5 x8 8.5 0.5 16 .
16 -15 INFEASIBLE 9 . . . 15 .
17 15 ACTIVE 11.375 x12 3.375 0.375 16 .
18 -17 ACTIVE 11.166667 x8 7.167 0.16667 17 .
19 18 ACTIVE 11.125 x12 4.125 0.125 18 .
20 19 SUBOPTIMAL 11 . . . 7 7
21 7 ACTIVE 13.5 x8 13.5 0.5 8 7
22 -21 INFEASIBLE 11 . . . 7 7
23 21 ACTIVE 13.375 x12 0.375 0.375 8 7
24 -23 ACTIVE 13.166667 x8 12.17 0.16667 9 7
25 24 ACTIVE 13.125 x12 1.125 0.125 10 7
26 25 SUBOPTIMAL 13 . . . 4 5
27 6 ACTIVE 14.535714 x3 0.045 0.50893 5 5
28 -27 FATHOMED 12.625 . . . 4 5
29 27 SUBOPTIMAL 14 . . . 1 4
30 -1 ACTIVE 18.309524 x3 0.129 1.31905 2 4
31 30 ACTIVE 17.67723 x6 0.886 0.43662 3 4
32 31 ACTIVE 15.485156 x2 0.777 1.50833 4 4
33 -32 ACTIVE 15.2625 x1 0.121 1.38333 4 4
34 33 ACTIVE 15.085106 x10 3.532 0.91489 4 4
35 34 FATHOMED 14.857143 . . . 3 4
36 32 FATHOMED 11.212121 . . . 2 4
37 -31 ACTIVE 17.56338 x10 7.93 0.43662 3 4
38 37 ACTIVE 17.225962 x8 2.38 0.69231 4 4
39 38 ACTIVE 17.221818 x1 0.016 0.37111 5 4
40 -39 FATHOMED 14.43662 . . . 4 4
41 39 ACTIVE 17.172375 x2 0.133 0.31948 5 4
42 41 ACTIVE 16.890196 x5 0.086 0.19608 6 4
43 42 ACTIVE 16.75 x12 9.75 0.25 7 4
44 -43 SUBOPTIMAL 15 . . . 6 3
45 43 SUBOPTIMAL 16 . . . 3 2
46 -38 FATHOMED 17.138028 . . . 2 2
Output 5.11.1 continued
Integer Iteration Log
Iter Problem Condition Objective Branched Value Sinfeas Active Proximity
47 -37 SUBOPTIMAL 17 . . . 1 1
48 -30 FATHOMED 16.566667 . . . 0 .
Solution Summary
Integer Optimal Solution
Objective Value 17
Phase 1 Iterations 0
Phase 2 Iterations 13
Phase 3 Iterations 79
Integer Iterations 48
Integer Solutions 6
Initial Basic Feasible Variables 12
Time Used (seconds) 0
Number of Inversions 17
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
Output 5.11.1 continued
Variable Summary
Variable Reduced
Col Name Status Type Price Activity Cost
1 x1 DEGEN BINARY 0 0 0
2 x2 BINARY 0 0 -4
3 x3 BINARY 0 0 -4
4 x4 BINARY 0 1 -18
5 x5 DEGEN BINARY 0 0 0
6 x6 BINARY 0 1 -1
7 x7 INTEGER 1 0 -6.5
8 x8 INTEGER 1 0 -3
9 x9 INTEGER 1 0 -1
10 x10 INTEGER 1 8 -8
11 x11 INTEGER 1 0 -8.545455
12 x12 BASIC INTEGER 1 9 0
13 _OBS2_ BASIC SLACK 0 20 0
14 _OBS3_ BASIC SLACK 0 5 0
15 _OBS4_ BASIC SLACK 0 10 0
16 _OBS5_ SLACK 0 0 -1
17 _OBS6_ SLACK 0 0 -1.5
18 _OBS7_ DEGEN SLACK 0 0 0
19 _OBS8_ DEGEN SLACK 0 0 0
20 _OBS9_ BASIC SLACK 0 2 0
21 _OBS10_ SLACK 0 0 -2.545455
22 _OBS11_ BASIC SLACK 0 2 0
Constraint Summary
Constraint S/S Dual
Row Name Type Col Rhs Activity Activity
1 _OBS1_ OBJECTVE . 0 17 .
2 _OBS2_ LE 13 110 90 0
3 _OBS3_ LE 14 95 90 0
4 _OBS4_ LE 15 80 70 0
5 _OBS5_ LE 16 100 100 1
6 _OBS6_ LE 17 0 0 1.5
7 _OBS7_ LE 18 0 0 0
8 _OBS8_ LE 19 0 0 0
9 _OBS9_ LE 20 0 -2 0
10 _OBS10_ LE 21 0 0 2.5454545
11 _OBS11_ LE 22 0 -2 0
Although the VARSELECT=PENALTY strategy works well in this example, there is no guarantee
that it will work well with your model. Experimentation with various strategies is necessary to find
the one that works well with your model and data, particularly if a model is solved repeatedly with
few changes to either the structure or the data.
Example 5.12: An Assignment Problem
This example departs somewhat from the emphasis of previous ones. Typically, linear programming
models are large, have considerable structure, and are solved with some regularity. Some form of
automatic model building, or matrix generation as it is commonly called, is a useful aid. The sparse
input format provides a great deal of flexibility in model specification so that, in many cases, the
DATA step can be used to generate the matrix.
The following assignment problem illustrates some techniques in matrix generation. In this example,
you have four machines that can produce any of six grades of cloth, and you have five customers
that demand various amounts of each grade of cloth. The return from supplying a customer with a
demanded grade depends on the machine on which the cloth was made. In addition, the machine
capacity depends both upon the specific machine used and the grade of cloth made.
To formulate this problem, let i denote customer, j denote grade, and k denote machine. Then let
x_ijk denote the amount of cloth of grade j made on machine k for customer i; let r_ijk denote the
return from selling one unit of grade j cloth made on machine k to customer i; let d_ij denote the
demand for grade j cloth by customer i; let c_jk denote the number of units of machine k required
to produce one unit of grade j cloth; and let a_k denote the number of units of machine k available.
Then, you get

   maximize    sum{ijk} r_ijk * x_ijk
   subject to  sum{k} x_ijk = d_ij           for all i and j
               sum{ij} c_jk * x_ijk <= a_k   for all k
               x_ijk >= 0                    for all i, j, and k
The data are saved in three data sets. The OBJECT data set contains the returns for satisfying demand,
the DEMAND data set contains the amounts demanded, and the RESOURCE data set contains the
conversion factors for each grade and the total amounts of machine resources available.
title 'An Assignment Problem';
data object;
input machine customer
grade1 grade2 grade3 grade4 grade5 grade6;
datalines;
1 1 102 140 105 105 125 148
1 2 115 133 118 118 143 166
1 3 70 108 83 83 88 86
1 4 79 117 87 87 107 105
1 5 77 115 90 90 105 148
2 1 123 150 125 124 154 .
2 2 130 157 132 131 166 .
2 3 103 130 115 114 129 .
2 4 101 128 108 107 137 .
2 5 118 145 130 129 154 .
3 1 83 . . 97 122 147
3 2 119 . . 133 163 180
3 3 67 . . 91 101 101
3 4 85 . . 104 129 129
3 5 90 . . 114 134 179
4 1 108 121 79 . 112 132
4 2 121 132 92 . 130 150
4 3 78 91 59 . 77 72
4 4 100 113 76 . 109 104
4 5 96 109 77 . 105 145
;
data demand;
input customer
grade1 grade2 grade3 grade4 grade5 grade6;
datalines;
1 100 100 150 150 175 250
2 300 125 300 275 310 325
3 400 0 400 500 340 0
4 250 0 750 750 0 0
5 0 600 300 0 210 360
;
data resource;
input machine
grade1 grade2 grade3 grade4 grade5 grade6 avail;
datalines;
1 .250 .275 .300 .350 .310 .295 744
2 .300 .300 .305 .315 .320 . 244
3 .350 . . .320 .315 .300 790
4 .280 .275 .260 . .250 .295 672
;
The linear program is built using the DATA step. The model is saved in a SAS data set in the sparse
input format for PROC LP. Each section of the following DATA step generates a piece of the linear
program. The first section generates the objective function; the next section generates the demand
constraints; and the last section generates the machine resource availability constraints.
/* build the linear programming model */
data model;
array grade{6} grade1-grade6;
length _type_ $ 8 _row_ $ 8 _col_ $ 8;
keep _type_ _row_ _col_ _coef_;
ncust=5;
nmach=4;
ngrade=6;
/* generate the objective function */
_type_='MAX';
_row_='OBJ';
do k=1 to nmach;
do i=1 to ncust;
link readobj;   /* read the objective coefficient data */
do j=1 to ngrade;
if grade{j}^=. then do;
_col_='X'||put(i,1.)||put(j,1.)||put(k,1.);
_coef_=grade{j};
output;
end;
end;
end;
end;
/* generate the demand constraints */
do i=1 to ncust;
link readdmd;   /* read the demand data */
do j=1 to ngrade;
if grade{j}^=. then do;
_type_='EQ';
_row_='DEMAND'||put(i,1.)||put(j,1.);
_col_='_RHS_';
_coef_=grade{j};
output;
_type_=' ';
do k=1 to nmach;
_col_='X'||put(i,1.)||put(j,1.)||put(k,1.);
_coef_=1.0;
output;
end;
end;
end;
end;
/* generate the machine constraints */
do k=1 to nmach;
link readres;   /* read the machine data */
_type_='LE';
_row_='MACHINE'||put(k,1.);
_col_='_RHS_';
_coef_=avail;
output;
_type_=' ';
do i=1 to ncust;
do j=1 to ngrade;
if grade{j}^=. then do;
_col_='X'||put(i,1.)||put(j,1.)||put(k,1.);
_coef_=grade{j};
output;
end;
end;
end;
end;
readobj: set object;
return;
readdmd: set demand;
return;
readres: set resource;
return;
run;
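The pattern in the DATA step — looping over the index sets and emitting one sparse observation (_TYPE_, _ROW_, _COL_, _COEF_) per nonzero — carries over to any language. A Python sketch of just the demand-constraint section, using hypothetical demands:

```python
def demand_rows(demand, nmach):
    """Emit sparse-format tuples for the demand constraints.
    demand maps (customer, grade) -> amount; missing pairs are skipped."""
    rows = []
    for (i, j), amount in demand.items():
        row_name = f"DEMAND{i}{j}"
        rows.append(("EQ", row_name, "_RHS_", amount))    # RHS entry
        for k in range(1, nmach + 1):                     # one X_ijk per machine
            rows.append(("", row_name, f"X{i}{j}{k}", 1.0))
    return rows

# Two hypothetical demands and four machines: 2 RHS rows + 8 variable rows.
rows = demand_rows({(1, 1): 100, (1, 2): 100}, nmach=4)
print(len(rows))   # prints 10
print(rows[0])     # prints ('EQ', 'DEMAND11', '_RHS_', 100)
```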
With the model built and saved in a data set, it is ready for solution using PROC LP. The following
program solves the model and saves the solution in the data set called PRIMAL:
/* solve the linear program */
proc lp data=model sparsedata noprint primalout=primal;
run;
The following output is produced by PROC LP.
Output 5.12.1 An Assignment Problem
An Assignment Problem
The LP Procedure
Problem Summary
Objective Function Max OBJ
Rhs Variable _RHS_
Type Variable _type_
Problem Density (%) 5.31
Variables Number
Non-negative 120
Slack 4
Total 124
Constraints Number
LE 4
EQ 30
Objective 1
Total 35
Output 5.12.1 continued
Solution Summary
Terminated Successfully
Objective Value 871426.03763
Phase 1 Iterations 0
Phase 2 Iterations 40
Phase 3 Iterations 0
Integer Iterations 0
Integer Solutions 0
Initial Basic Feasible Variables 36
Time Used (seconds) 0
Number of Inversions 3
Epsilon 1E-8
Infinity 1.797693E308
Maximum Phase 1 Iterations 100
Maximum Phase 2 Iterations 100
Maximum Phase 3 Iterations 99999999
Maximum Integer Iterations 100
Time Limit (seconds) 120
The solution is prepared for reporting using the DATA step, and a report is written using PROC
TABULATE.
/* report the solution */
data solution;
set primal;
keep customer grade machine amount;
if substr(_var_,1,1)='X' then do;
if _value_^=0 then do;
customer = substr(_var_,2,1);
grade = substr(_var_,3,1);
machine = substr(_var_,4,1);
amount = _value_;
output;
end;
end;
run;
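The positional decoding done with SUBSTR can be sketched in Python as well (illustrative only; it assumes the X<customer><grade><machine> naming generated above):

```python
def decode(var_name, value):
    """Split a solution variable such as 'X123' into customer, grade,
    and machine, as the DATA step does with SUBSTR; skip other
    variables and zero activities."""
    if not var_name.startswith("X") or value == 0:
        return None
    return {"customer": var_name[1], "grade": var_name[2],
            "machine": var_name[3], "amount": value}

print(decode("X123", 150.0))
# prints {'customer': '1', 'grade': '2', 'machine': '3', 'amount': 150.0}
print(decode("_RHS_", 5.0))   # prints None
```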
proc tabulate data=solution;
class customer grade machine;
var amount;
table (machine*customer), (grade*amount);
run;
The report shown in Output 5.12.2 gives the assignment of customer, grade of cloth, and machine
that maximizes the return and does not violate the machine resource availability.
Output 5.12.2 An Assignment Problem
An Assignment Problem
------------------------------------------------------------------------
| | grade |
| |---------------------------------------------------|
| | 1 | 2 | 3 | 4 |
| |------------+------------+------------+------------|
| | amount | amount | amount | amount |
| |------------+------------+------------+------------|
| | Sum | Sum | Sum | Sum |
|------------------+------------+------------+------------+------------|
|machine |customer | | | | |
|--------+---------| | | | |
|1 |1 | .| 100.00| 150.00| 150.00|
| |---------+------------+------------+------------+------------|
| |2 | .| .| 300.00| .|
| |---------+------------+------------+------------+------------|
| |3 | .| .| 256.72| 210.31|
| |---------+------------+------------+------------+------------|
| |4 | .| .| 750.00| .|
| |---------+------------+------------+------------+------------|
| |5 | .| 92.27| .| .|
|--------+---------+------------+------------+------------+------------|
|2 |3 | .| .| 143.28| .|
| |---------+------------+------------+------------+------------|
| |5 | .| .| 300.00| .|
|--------+---------+------------+------------+------------+------------|
|3 |2 | .| .| .| 275.00|
| |---------+------------+------------+------------+------------|
| |3 | .| .| .| 289.69|
| |---------+------------+------------+------------+------------|
| |4 | .| .| .| 750.00|
| |---------+------------+------------+------------+------------|
| |5 | .| .| .| .|
|--------+---------+------------+------------+------------+------------|
|4 |1 | 100.00| .| .| .|
| |---------+------------+------------+------------+------------|
| |2 | 300.00| 125.00| .| .|
| |---------+------------+------------+------------+------------|
| |3 | 400.00| .| .| .|
| |---------+------------+------------+------------+------------|
| |4 | 250.00| .| .| .|
| |---------+------------+------------+------------+------------|
| |5 | .| 507.73| .| .|
------------------------------------------------------------------------
(Continued)
Output 5.12.2 continued
An Assignment Problem
----------------------------------------------
| | grade |
| |-------------------------|
| | 5 | 6 |
| |------------+------------|
| | amount | amount |
| |------------+------------|
| | Sum | Sum |
|------------------+------------+------------|
|machine |customer | | |
|--------+---------| | |
|1 |1 | 175.00| 250.00|
| |---------+------------+------------|
| |2 | .| .|
| |---------+------------+------------|
| |3 | .| .|
| |---------+------------+------------|
| |4 | .| .|
| |---------+------------+------------|
| |5 | .| .|
|--------+---------+------------+------------|
|2 |3 | 340.00| .|
| |---------+------------+------------|
| |5 | .| .|
|--------+---------+------------+------------|
|3 |2 | 310.00| 325.00|
| |---------+------------+------------|
| |3 | .| .|
| |---------+------------+------------|
| |4 | .| .|
| |---------+------------+------------|
| |5 | 210.00| 360.00|
|--------+---------+------------+------------|
|4 |1 | .| .|
| |---------+------------+------------|
| |2 | .| .|
| |---------+------------+------------|
| |3 | .| .|
| |---------+------------+------------|
| |4 | .| .|
| |---------+------------+------------|
| |5 | .| .|
----------------------------------------------
Example 5.13: A Scheduling Problem
Scheduling is an application area where techniques in model generation can be valuable. Problems
involving scheduling are often solved with integer programming and are similar to assignment
problems. In this example, you have eight one-hour time slots in each of five days. You have to
assign four people to these time slots so that each slot is covered on every day. You allow the people
to specify preference data for each slot on each day. In addition, there are constraints that must be
satisfied:

• Each person has some slots for which they are unavailable.

• Each person must have either slot 4 or 5 off for lunch.

• Each person can work only two time slots in a row.

• Each person can work only a specified number of hours in the week.
To formulate this problem, let i denote person, j denote time slot, and k denote day. Then, let
x_{ijk} = 1 if person i is assigned to time slot j on day k, and 0 otherwise; let p_{ijk} denote the
preference of person i for slot j on day k; and let h_i denote the number of hours in a week that
person i will work. Then, you get

   max  sum_{ijk} p_{ijk} x_{ijk}

   subject to  sum_{i} x_{ijk} = 1                          for all j and k
               x_{i4k} + x_{i5k} <= 1                       for all i and k
               x_{ilk} + x_{i,l+1,k} + x_{i,l+2,k} <= 2     for all i and k, and l = 1, ..., 6
               sum_{jk} x_{ijk} <= h_i                      for all i
               x_{ijk} = 0 or 1 for all i, j, and k such that p_{ijk} > 0;
               otherwise x_{ijk} = 0
To solve this problem, create a data set that has the hours and preference data for each individual,
time slot, and day. A 10 represents the most desirable time slot, and a 1 represents the least desirable
time slot. In addition, a 0 indicates that the time slot is not available.
data raw;
input name $ hour slot mon tue wed thu fri;
datalines;
marc 20 1 10 10 10 10 10
marc 20 2 9 9 9 9 9
marc 20 3 8 8 8 8 8
marc 20 4 1 1 1 1 1
marc 20 5 1 1 1 1 1
marc 20 6 1 1 1 1 1
marc 20 7 1 1 1 1 1
marc 20 8 1 1 1 1 1
mike 20 1 10 9 8 7 6
mike 20 2 10 9 8 7 6
mike 20 3 10 9 8 7 6
mike 20 4 10 3 3 3 3
mike 20 5 1 1 1 1 1
mike 20 6 1 2 3 4 5
mike 20 7 1 2 3 4 5
mike 20 8 1 2 3 4 5
bill 20 1 10 10 10 10 10
bill 20 2 9 9 9 9 9
bill 20 3 8 8 8 8 8
bill 20 4 0 0 0 0 0
bill 20 5 1 1 1 1 1
bill 20 6 1 1 1 1 1
bill 20 7 1 1 1 1 1
bill 20 8 1 1 1 1 1
bob 20 1 10 9 8 7 6
bob 20 2 10 9 8 7 6
bob 20 3 10 9 8 7 6
bob 20 4 10 3 3 3 3
bob 20 5 1 1 1 1 1
bob 20 6 1 2 3 4 5
bob 20 7 1 2 3 4 5
bob 20 8 1 2 3 4 5
;
These data are read by the following DATA step, and an integer program is built to solve the problem.
The model is saved in the data set named MODEL. First, the objective function is built using the data
saved in the RAW data set. Then, the constraints requiring a person to be working in each time slot
are built. Next, the constraints allowing each person time for lunch are added. Then, the constraints
restricting people to only two consecutive hours are added. Next, the constraints limiting the time
that any one person works in a week are added. Finally, the constraints allowing a person to be
assigned only to a time slot for which he is available are added. The code to build each of these
constraints follows the formulation closely.
data model;
array workweek{5} mon tue wed thu fri;
array hours{4} hours1 hours2 hours3 hours4;
retain hours1-hours4;
set raw end=eof;
length _row_ $ 8 _col_ $ 8 _type_ $ 8;
keep _type_ _col_ _row_ _coef_;
if name='marc' then i=1;
else if name='mike' then i=2;
else if name='bill' then i=3;
else if name='bob' then i=4;
hours{i}=hour;
/* build the objective function */
do k=1 to 5;
_col_='x'||put(i,1.)||put(slot,1.)||put(k,1.);
_row_='object';
_coef_=workweek{k}*1000;
output;
_row_='upper';
if workweek{k}^=0 then _coef_=1;
output;
_row_='integer';
_coef_=1;
output;
end;
/* build the rest of the model */
if eof then do;
_coef_=.;
_col_=' ';
_type_='upper';
_row_='upper';
output;
_type_='max';
_row_='object';
output;
_type_='int';
_row_='integer';
output;
/* every hour 1 person working */
do j=1 to 8;
do k=1 to 5;
_row_='work'||put(j,1.)||put(k,1.);
_type_='eq';
_col_='_RHS_';
_coef_=1;
output;
_coef_=1;
_type_=' ';
do i=1 to 4;
_col_='x'||put(i,1.)||put(j,1.)||put(k,1.);
output;
end;
end;
end;
/* each person has a lunch */
do i=1 to 4;
do k=1 to 5;
_row_='lunch'||put(i,1.)||put(k,1.);
_type_='le';
_col_='_RHS_';
_coef_=1;
output;
_coef_=1;
_type_=' ';
_col_='x'||put(i,1.)||'4'||put(k,1.);
output;
_col_='x'||put(i,1.)||'5'||put(k,1.);
output;
end;
end;
/* work at most 2 slots in a row */
do i=1 to 4;
do k=1 to 5;
do l=1 to 6;
_row_='seq'||put(i,1.)||put(k,1.)||put(l,1.);
_type_='le';
_col_='_RHS_';
_coef_=2;
output;
_coef_=1;
_type_=' ';
do j=0 to 2;
_col_='x'||put(i,1.)||put(l+j,1.)||put(k,1.);
output;
end;
end;
end;
end;
/* work at most n hours in a week */
do i=1 to 4;
_row_='capacit'||put(i,1.);
_type_='le';
_col_='_RHS_';
_coef_=hours{i};
output;
_coef_=1;
_type_=' ';
do j=1 to 8;
do k=1 to 5;
_col_='x'||put(i,1.)||put(j,1.)||put(k,1.);
output;
end;
end;
end;
end;
run;
The model saved in the data set named MODEL is in the sparse format. The constraint that requires
one person to work in time slot 1 on day 2 is named WORK12; it is

   sum_{i} x_{i12} = 1
The following model is saved in the MODEL data set (which has 1387 observations).
_TYPE_ _COL_ _ROW_ _COEF_
eq _RHS_ work12 1
x112 work12 1
x212 work12 1
x312 work12 1
x412 work12 1
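Each observation of the sparse format carries a single coefficient keyed by row and column name.
As a rough illustration of how such records are produced, the following Python sketch mirrors the
DATA step loop that builds the WORK constraints (illustrative only, not part of PROC LP; the
dictionary keys mirror the _TYPE_, _COL_, _ROW_, and _COEF_ variables):

```python
# Generate the sparse-format records for constraint work{j}{k}: one 'eq'
# record carrying the right-hand side, then one record per variable
# x{i}{j}{k} with coefficient 1.
def work_constraint(j, k, npeople=4):
    row = f"work{j}{k}"
    records = [{"_TYPE_": "eq", "_COL_": "_RHS_", "_ROW_": row, "_COEF_": 1}]
    for i in range(1, npeople + 1):
        records.append({"_TYPE_": " ", "_COL_": f"x{i}{j}{k}",
                        "_ROW_": row, "_COEF_": 1})
    return records

recs = work_constraint(1, 2)
print(len(recs))         # 5 observations, matching the listing above
print(recs[1]["_COL_"])  # x112
```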
The model is solved using the LP procedure. The option PRIMALOUT=SOLUTION causes PROC
LP to save the primal solution in the data set named SOLUTION.
/* solve the linear program */
proc lp sparsedata noprint primalout=solution
time=1000 maxit1=1000 maxit2=1000;
run;
The following DATA step takes the solution data set SOLUTION and generates a report data
set named REPORT. It translates the variable names x_{ijk} so that a more meaningful report can be
written. Then, PROC TABULATE is used to display a schedule showing how the eight
time slots are covered for the week.
/* report the solution */
title 'Reported Solution';
data report;
set solution;
keep name slot mon tue wed thu fri;
if substr(_var_,1,1)='x' then do;
if _value_>0 then do;
n=substr(_var_,2,1);
slot=substr(_var_,3,1);
d=substr(_var_,4,1);
if n='1' then name='marc';
else if n='2' then name='mike';
else if n='3' then name='bill';
else name='bob';
if d='1' then mon=1;
else if d='2' then tue=1;
else if d='3' then wed=1;
else if d='4' then thu=1;
else fri=1;
output;
end;
end;
run;
proc format;
value xfmt 1=' xxx ';
run;
proc tabulate data=report;
class name slot;
var mon--fri;
table (slot*name), (mon tue wed thu fri)*sum=' '*f=xfmt.
/misstext=' ';
run;
Output 5.13.1 from PROC TABULATE summarizes the schedule. Notice that the constraint requiring
that a person be assigned to each possible time slot on each day is satisfied.
Output 5.13.1 A Scheduling Problem
Reported Solution
-----------------------------------------------------------------
| | mon | tue | wed | thu | fri |
|------------------+--------+--------+--------+--------+--------|
|slot |name | | | | | |
|--------+---------| | | | | |
|1 |bill | xxx | xxx | xxx | xxx | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|2 |bob | xxx | | | | |
| |---------+--------+--------+--------+--------+--------|
| |marc | | xxx | xxx | xxx | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|3 |bob | | xxx | | | |
| |---------+--------+--------+--------+--------+--------|
| |marc | | | xxx | xxx | xxx |
| |---------+--------+--------+--------+--------+--------|
| |mike | xxx | | | | |
|--------+---------+--------+--------+--------+--------+--------|
|4 |mike | xxx | xxx | xxx | xxx | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|5 |bob | xxx | xxx | xxx | xxx | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|6 |bob | | xxx | | xxx | |
| |---------+--------+--------+--------+--------+--------|
| |marc | xxx | | | | |
| |---------+--------+--------+--------+--------+--------|
| |mike | | | xxx | | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|7 |bill | xxx | | | | |
| |---------+--------+--------+--------+--------+--------|
| |bob | | | xxx | | |
| |---------+--------+--------+--------+--------+--------|
| |mike | | xxx | | xxx | xxx |
|--------+---------+--------+--------+--------+--------+--------|
|8 |bill | xxx | | | | |
| |---------+--------+--------+--------+--------+--------|
| |bob | | | | | xxx |
| |---------+--------+--------+--------+--------+--------|
| |mike | | xxx | xxx | xxx | |
-----------------------------------------------------------------
Recall that PROC LP puts a character string in the macro variable _ORLP_ that describes the
characteristics of the solution on termination. This string can be parsed using macro functions and
the information obtained can be used in report writing. The variable can be written to the log with
the command
%put &_orlp_;
which produces Figure 5.1.
Figure 5.1 _ORLP_ Macro Variable
STATUS=SUCCESSFUL PHASE=3 OBJECTIVE=211000 P_FEAS=YES D_FEAS=YES
INT_ITER=0 INT_FEAS=1 ACTIVE=0 INT_BEST=211000 PHASE1_ITER=34
PHASE2_ITER=51 PHASE3_ITER=0
From this you learn, for example, that at termination the solution is integer optimal and has an
objective value of 211000.
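Within SAS the string would be parsed with macro functions such as %SCAN; because it is plain
keyword=value text separated by blanks, the same split is easy to sketch outside SAS as well. A
Python illustration (the string below is the one shown in Figure 5.1):

```python
# Parse the keyword=value pairs of the _ORLP_ macro variable string.
orlp = ("STATUS=SUCCESSFUL PHASE=3 OBJECTIVE=211000 P_FEAS=YES D_FEAS=YES "
        "INT_ITER=0 INT_FEAS=1 ACTIVE=0 INT_BEST=211000 PHASE1_ITER=34 "
        "PHASE2_ITER=51 PHASE3_ITER=0")

info = dict(pair.split("=") for pair in orlp.split())

print(info["STATUS"])     # SUCCESSFUL
print(info["OBJECTIVE"])  # 211000
```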
Example 5.14: A Multicommodity Transshipment Problem with Fixed
Charges
The following example illustrates a DATA step program for generating a linear program to solve
a multicommodity network flow model that has fixed charges. Consider a network consisting of
the following nodes: farm-a, farm-b, farm-c, Chicago, St. Louis, and New York. You can ship four
commodities from each farm to Chicago or St. Louis and from Chicago or St. Louis to New York.
The following table shows the unit shipping cost for each of the four commodities across each of the
arcs. The table also shows the supply (positive numbers) at each of the from nodes and the demand
(negative numbers) at each of the to nodes. The fixed charge is a fixed cost for shipping any nonzero
amount across an arc. For example, if any amount of any of the four commodities is sent from farm-c
to St. Louis, then a fixed charge of 75 units is added to the shipping cost.
Table 5.13 Farms to Cities Network Problem
From      To         Unit Shipping Cost     Supply and Demand      Fixed
Node      Node        1    2    3    4       1     2    3    4     Charge
farm-a    Chicago    20   15   17   22     100   100   40    .      100
farm-b    Chicago    15   15   15   30     100   200   50   50       75
farm-c    Chicago    30   30   10   10      40   100   75  100      100
farm-a    StLouis    30   25   27   22       .     .    .    .      150
farm-c    StLouis    10    9   11   10       .     .    .    .       75
Chicago   NY         75   75   75   75    -150  -200  -50  -75      200
StLouis   NY         80   80   80   80       .     .    .    .      200
The following program is designed to take the data in the form given in the preceding table. It builds
the node arc incidence matrix for a network given in this form and adds integer variables to capture
the fixed charge using the type of constraints discussed in Example 5.8. The program solves the
model using PROC LP, saves the solution in the PRIMALOUT= data set named SOLUTION, and
displays the solution. The DATA step can be easily modified to handle larger problems with similar
structure.
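The fixed-charge device links each arc's total flow to a 0/1 variable y through a big-M constraint of
the form (sum of flows on the arc) - M*y <= 0: any positive flow forces y = 1, and y in turn carries
the fixed charge in the objective. The following Python sketch checks that logic on the farm-c to
St. Louis arc (unit costs and the 75-unit fixed charge come from Table 5.13; the flows 40 and 100
are the ones reported in Output 5.14.1; this is an illustration, not PROC LP code):

```python
# Big-M linking constraint for a fixed-charge arc:
#   (sum of commodity flows on the arc) - M * y <= 0,   y in {0, 1}.
# Any positive flow forces y = 1, and y carries the fixed charge in the cost.
M = 1.0e6  # same bound the DATA step stores in the variable M

def arc_cost(flows, unit_costs, fixed_charge, y):
    """Shipping cost of one arc; raises if the linking constraint fails."""
    assert sum(flows) - M * y <= 0, "positive flow with y = 0 is infeasible"
    return sum(f * c for f, c in zip(flows, unit_costs)) + fixed_charge * y

# farm-c -> StLouis: costs 10, 9, 11, 10; fixed charge 75; flows 40 and 100.
print(arc_cost([40, 100, 0, 0], [10, 9, 11, 10], 75, y=1))  # 1375
```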
title 'Multi-commodity Transhipment Problem with Fixed-Charges';
%macro dooversd;
_coef_=sd{_i_};
if sd{_i_}>0 then do;                 /* the node is a supply node */
_row_=from||' commodity'||put(_i_,2.);
if from^=' ' then output;
end;
else if sd{_i_}<0 then do;            /* the node is a demand node */
_row_=to||' commodity'||put(_i_,2.);
if to^=' ' then output;
end;
else if from^=' ' & to^=' ' then do;  /* a transshipment node */
_coef_=0;
_row_=from||' commodity'||put(_i_,2.); output;
_row_=to ||' commodity'||put(_i_,2.); output;
end;
%mend dooversd;
%macro dooverc;
_col_=arc||' commodity'||put(_i_,2.);
if from^=' ' & to^=' ' then do;       /* add node arc incidence matrix */
_type_='le'; _row_=from||' commodity'||put(_i_,2.);
_coef_=1; output;
_row_=to ||' commodity'||put(_i_,2.);
_coef_=-1; output;
_type_=' '; _row_='obj';
_coef_=c{_i_}; output;
/* add fixed charge variables */
_type_='le'; _row_=arc;
_coef_=1; output;
_col_='_rhs_';
_type_=' ';
_coef_=0; output;
_col_=arc||'fx';
_coef_=-M; output;
_row_='int';
_coef_=1; output;
_row_='obj';
_coef_=fx; output;
_row_='upper';
_coef_=1; output;
end;
%mend dooverc;
data network;
retain M 1.0e6;
length _col_ $ 22 _row_ $ 22;
keep _type_ _col_ _row_ _coef_ ;
array sd sd1-sd4;
array c c1-c4;
input arc $10. from $ to $ c1 c2 c3 c4 sd1 sd2 sd3 sd4 fx;
/* for the first observation define some of the rows */
if _n_=1 then do;
_type_='upperbd'; _row_='upper'; output;
_type_='lowerbd'; _row_='lower'; output;
_type_='min'; _row_='obj'; output;
_type_='integer'; _row_='int'; output;
end;
_col_='_rhs_'; _type_='le';
do _i_ = 1 to dim(sd);
%dooversd;
end;
do _i_ = 1 to dim(c);
%dooverc;
end;
datalines;
a-Chicago farm-a Chicago 20 15 17 22 100 100 40 . 100
b-Chicago farm-b Chicago 15 15 15 30 100 200 50 50 75
c-Chicago farm-c Chicago 30 30 10 10 40 100 75 100 100
a-StLouis farm-a StLouis 30 25 27 22 . . . . 150
c-StLouis farm-c StLouis 10 9 11 10 . . . . 75
Chicago-NY Chicago NY 75 75 75 75 -150 -200 -50 -75 200
StLous-NY StLouis NY 80 80 80 80 . . . . 200
;
/* solve the model */
proc lp sparsedata pout=solution noprint;
run;
/* print the solution */
data;
set solution;
rename _var_=arc _value_=amount;
if _value_^=0 & _type_='NON-NEG';
run;
proc print;
id arc;
var amount;
run;
The results from this example are shown in Output 5.14.1. The NOPRINT option in the PROC LP
statement suppresses the Variable and Constraint Summary sections. This is useful when solving
large models for which a report program is available. Here, the solution is saved in data set SOLUTION
and reported using PROC PRINT. The solution shows the amount that is shipped over each arc.
Output 5.14.1 Multicommodity Transshipment Problem with Fixed Charges
Multi-commodity Transhipment Problem with Fixed-Charges
arc amount
a-Chicago commodity 1 10
b-Chicago commodity 1 100
b-Chicago commodity 2 100
c-Chicago commodity 3 50
c-Chicago commodity 4 75
c-StLouis commodity 1 40
c-StLouis commodity 2 100
Chicago-NY commodity 1 110
Chicago-NY commodity 2 100
Chicago-NY commodity 3 50
Chicago-NY commodity 4 75
StLous-NY commodity 1 40
StLous-NY commodity 2 100
Example 5.15: Converting to an MPS-Format SAS Data Set
This example demonstrates the use of the MPSOUT= option to convert problem data in PROC LP
input format into an MPS-format SAS data set for use with the OPTLP procedure.
Consider the oil blending problem introduced in the section An Introductory Example on page 175.
Suppose you have saved the problem data in dense format by using the following DATA step:
data exdata;
input _id_ $17. a_light a_heavy brega naphthal naphthai
heatingo jet_1 jet_2 _type_ $ _rhs_;
datalines;
profit -175 -165 -205 0 0 0 300 300 max .
naphtha_l_conv .035 .030 .045 -1 0 0 0 0 eq 0
naphtha_i_conv .100 .075 .135 0 -1 0 0 0 eq 0
heating_o_conv .390 .300 .430 0 0 -1 0 0 eq 0
recipe_1 0 0 0 0 .3 .7 -1 0 eq 0
recipe_2 0 0 0 .2 0 .8 0 -1 eq 0
available 110 165 80 . . . . . upperbd .
;
If you decide to solve the problem by using the OPTLP procedure, you will need to convert the
data set exdata from dense format to MPS format. You can accomplish this by using the following
statements:
proc lp data=exdata mpsout=mpsdata;
run;
The MPS-format SAS data set mpsdata is shown in Output 5.15.1.
Output 5.15.1 Data Set mpsdata
Obs FIELD1 FIELD2 FIELD3 FIELD4 FIELD5 FIELD6
1 NAME PROBLEM . .
2 ROWS . .
3 MAX profit . .
4 E naphtha_l_conv . .
5 E naphtha_i_conv . .
6 E heating_o_conv . .
7 E recipe_1 . .
8 E recipe_2 . .
9 COLUMNS . .
10 a_light profit -175.000 naphtha_l_conv 0.035
11 a_light naphtha_i_conv 0.100 heating_o_conv 0.390
12 a_heavy profit -165.000 naphtha_l_conv 0.030
13 a_heavy naphtha_i_conv 0.075 heating_o_conv 0.300
14 brega profit -205.000 naphtha_l_conv 0.045
15 brega naphtha_i_conv 0.135 heating_o_conv 0.430
16 naphthal naphtha_l_conv -1.000 recipe_2 0.200
17 naphthai naphtha_i_conv -1.000 recipe_1 0.300
18 heatingo heating_o_conv -1.000 recipe_1 0.700
19 heatingo recipe_2 0.800 .
20 jet_1 profit 300.000 recipe_1 -1.000
21 jet_2 profit 300.000 recipe_2 -1.000
22 BOUNDS . .
23 UP .BOUNDS. a_light 110.000 .
24 UP .BOUNDS. a_heavy 165.000 .
25 UP .BOUNDS. brega 80.000 .
26 ENDATA . .
Now that the problem data is in MPS format, you can solve the problem by using the OPTLP
procedure. For more information, see Chapter 17, The OPTLP Procedure.
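As Output 5.15.1 shows, the MPS-format data set stores each section in fixed fields FIELD1 through
FIELD6; the ROWS section, for instance, is one record per constraint carrying the row type (E for an
equality row, MAX for the objective) and the row name. The following Python sketch illustrates
that part of the conversion idea (illustrative only, not the MPSOUT= implementation; the row names
come from the exdata problem above):

```python
# Sketch of emitting the ROWS section of an MPS-format representation
# from the _id_/_type_ information of a dense-format LP data set.
dense_rows = [
    ("profit", "max"),
    ("naphtha_l_conv", "eq"),
    ("naphtha_i_conv", "eq"),
    ("heating_o_conv", "eq"),
    ("recipe_1", "eq"),
    ("recipe_2", "eq"),
]
mps_type = {"max": "MAX", "min": "MIN", "eq": "E", "le": "L", "ge": "G"}

rows_section = [("ROWS", "")]
rows_section += [(mps_type[t], name) for name, t in dense_rows]

for row_type, row_name in rows_section:
    print(f"{row_type:<5}{row_name}")
```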
Chapter 6
The NLP Procedure
Contents
Overview: NLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
Getting Started: NLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Introductory Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Syntax: NLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Functional Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
PROC NLP Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
ARRAY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
BOUNDS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
BY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
CRPJAC Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
DECVAR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
GRADIENT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
HESSIAN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
INCLUDE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
JACNLC Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
JACOBIAN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
LABEL Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
LINCON Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
MATRIX Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
MIN, MAX, and LSQ Statements . . . . . . . . . . . . . . . . . . . . . . . 350
MINQUAD and MAXQUAD Statements . . . . . . . . . . . . . . . . . . . . 351
NLINCON Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
PROFILE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
Program Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Details: NLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Criteria for Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Finite-Difference Approximations of Derivatives . . . . . . . . . . . . . . . 373
Hessian and CRP Jacobian Scaling . . . . . . . . . . . . . . . . . . . . . . 375
Testing the Gradient Specification . . . . . . . . . . . . . . . . . . . . . . 376
Termination Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
Active Set Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
Feasible Starting Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Line-Search Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
Restricting the Step Length . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Computational Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Input and Output Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 388
Displayed Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Computational Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
Memory Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Examples: NLP Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Example 6.1: Using the DATA= Option . . . . . . . . . . . . . . . . . . . . 404
Example 6.2: Using the INQUAD= Option . . . . . . . . . . . . . . . . . . 406
Example 6.3: Using the INEST= Option . . . . . . . . . . . . . . . . . . . . 407
Example 6.4: Restarting an Optimization . . . . . . . . . . . . . . . . . . . 409
Example 6.5: Approximate Standard Errors . . . . . . . . . . . . . . . . . 410
Example 6.6: Maximum Likelihood Weibull Estimation . . . . . . . . . . . . 417
Example 6.7: Simple Pooling Problem . . . . . . . . . . . . . . . . . . . . 425
Example 6.8: Chemical Equilibrium . . . . . . . . . . . . . . . . . . . . . 435
Example 6.9: Minimize Total Delay in a Network . . . . . . . . . . . . . . . 441
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Overview: NLP Procedure
The NLP procedure (NonLinear Programming) offers a set of optimization techniques for minimizing
or maximizing a continuous nonlinear function f(x) of n decision variables, x = (x_1, ..., x_n)^T,
with lower and upper bound, linear and nonlinear, equality and inequality constraints. This can be
expressed as solving

   min_{x in R^n}  f(x)

   subject to  c_i(x) = 0,         i = 1, ..., m_e
               c_i(x) >= 0,        i = m_e + 1, ..., m
               l_i <= x_i <= u_i,  i = 1, ..., n

where f is the objective function, the c_i's are the nonlinear functions, and the l_i's and u_i's are the
lower and upper bounds. Problems of this type are found in many settings ranging from optimal
control to maximum likelihood estimation.
The NLP procedure provides a number of algorithms for solving this problem that take advantage of
special structure on the objective function and constraints. One example is the quadratic programming
problem:
   min (max)  f(x) = (1/2) x^T G x + g^T x + b

   subject to  c_i(x) = 0,  i = 1, ..., m_e

where G is an n x n symmetric matrix, g = (g_1, ..., g_n)^T is a vector, b is a scalar, and the c_i(x)'s
are linear functions.
Another example is the least squares problem:

   min  f(x) = (1/2) { f_1^2(x) + ... + f_l^2(x) }

   subject to  c_i(x) = 0,  i = 1, ..., m_e

where the c_i(x)'s are linear functions, and f_1(x), ..., f_l(x) are nonlinear functions of x.
The following problems are handled by PROC NLP:
• quadratic programming with an option for sparse problems

• unconstrained minimization/maximization

• constrained minimization/maximization

• linear complementarity problem
The following optimization techniques are supported in PROC NLP:
• Quadratic Active Set Technique

• Trust Region Method

• Newton-Raphson Method with Line Search

• Newton-Raphson Method with Ridging

• Quasi-Newton Methods

• Double Dogleg Method

• Conjugate Gradient Methods

• Nelder-Mead Simplex Method

• Levenberg-Marquardt Method

• Hybrid Quasi-Newton Methods
These optimization techniques require a continuous objective function f, and all but one (NMSIMP)
require continuous first-order derivatives of the objective function f. Some of the techniques also
require continuous second-order derivatives. There are three ways to compute derivatives in PROC
NLP:

• analytically (using a special derivative compiler), the default method

• via finite-difference approximations

• via user-supplied exact or approximate numerical functions
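As an illustration of the second approach, a forward-difference scheme replaces each partial
derivative with (f(x + h e_i) - f(x)) / h for a small step h along the i-th coordinate. The following
Python sketch shows the textbook formula on a made-up function (PROC NLP's own implementation
also supports central differences and automatic step-size selection; this is not its internal code):

```python
# Forward-difference approximation of the gradient of f at x:
#   df/dx_i  ~=  (f(x + h*e_i) - f(x)) / h
def fd_gradient(f, x, h=1e-6):
    fx = f(x)
    grad = []
    for i in range(len(x)):
        xh = list(x)
        xh[i] += h          # perturb only the i-th coordinate
        grad.append((f(xh) - fx) / h)
    return grad

# Compare with the exact gradient of f(x) = x1^2 + 3*x2 at (2, 5): (4, 3).
g = fd_gradient(lambda x: x[0] ** 2 + 3 * x[1], [2.0, 5.0])
print(g)  # approximately [4.0, 3.0]
```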
Nonlinear programs can be input into the procedure in various ways. The objective, constraint, and
derivative functions are specified using the programming statements of PROC NLP. In addition,
information in SAS data sets can be used to define the structure of objectives and constraints as well
as specify constants used in objectives, constraints, and derivatives.
PROC NLP uses data sets to input various pieces of information:
• The DATA= data set enables you to specify data shared by all functions involved in a least
squares problem.

• The INQUAD= data set contains the arrays appearing in a quadratic programming problem.

• The INEST= data set specifies initial values for the decision variables, the values of con-
stants that are referred to in the program statements, and simple boundary and general linear
constraints.

• The MODEL= data set specifies a model (functions, constraints, derivatives) saved at a
previous execution of the NLP procedure.

PROC NLP uses data sets to output various results:

• The OUTEST= data set saves the values of the decision variables, the derivatives, the solution,
and the covariance matrix at the solution.

• The OUT= output data set contains variables generated in the program statements defining the
objective function as well as selected variables of the DATA= input data set, if available.

• The OUTMODEL= data set saves the programming statements. It can be used to input a
model in the MODEL= input data set.
Getting Started: NLP Procedure
The NLP procedure solves general nonlinear programs. It has several optimizers that are tuned to
best perform on a particular class of problems. Guidelines for choosing a particular optimizer for a
problem can be found in the section Optimization Algorithms on page 362.
Regardless of the selected optimizer, it is necessary to specify an objective function and constraints
that the optimal solution must satisfy. In PROC NLP, the objective function and the constraints are
specied using SAS programming statements that are similar to those used in the SAS DATA step.
Some of the differences are discussed in the section Program Statements on page 355 and in the
section ARRAY Statement on page 341. As with any programming language, there are many
different ways to specify the same problem. Some are more economical than others.
Introductory Examples
The following introductory examples illustrate how to get started using the NLP procedure.
An Unconstrained Problem
Consider the simple example of minimizing the Rosenbrock function (Rosenbrock 1960):

   f(x) = (1/2) { 100 (x_2 - x_1^2)^2 + (1 - x_1)^2 }
        = (1/2) { f_1^2(x) + f_2^2(x) },   x = (x_1, x_2)

The minimum function value is f(x*) = 0 at x* = (1, 1). This problem does not have any
constraints.
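As a quick check of the formulation, the function can be evaluated through its two residuals; at
(1, 1) both residuals vanish, so f = 0. A short Python sketch (an illustration, not PROC NLP code):

```python
# Rosenbrock function written through its two residuals,
# f(x) = 0.5 * (f1(x)^2 + f2(x)^2).
def rosenbrock(x1, x2):
    f1 = 10 * (x2 - x1 * x1)
    f2 = 1 - x1
    return 0.5 * (f1 * f1 + f2 * f2)

print(rosenbrock(1.0, 1.0))   # 0.0 at the minimizer
print(rosenbrock(-1.2, 1.0))  # about 12.1 at a common starting point
```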
The following statements can be used to solve this problem:
proc nlp;
min f;
decvar x1 x2;
f1 = 10 * (x2 - x1 * x1);
f2 = 1 - x1;
f = .5 * (f1 * f1 + f2 * f2);
run;
The MIN statement identifies the symbol f that characterizes the objective function in terms of f1
and f2, and the DECVAR statement names the decision variables x1 and x2. Because there is no
explicit optimizing algorithm option specified (TECH=), PROC NLP uses the Newton-Raphson
method with ridging, the default algorithm when there are no constraints.
A better way to solve this problem is to take advantage of the fact that f is a sum of squares of
f1 and f2 and to treat it as a least squares problem. Using the LSQ statement instead of the MIN
statement tells the procedure that this is a least squares problem, which results in the use of one of
the specialized algorithms for solving least squares problems (for example, Levenberg-Marquardt).
proc nlp;
lsq f1 f2;
decvar x1 x2;
f1 = 10 * (x2 - x1 * x1);
f2 = 1 - x1;
run;
The LSQ statement results in the minimization of a function that is the sum of squares of functions
that appear in the LSQ statement. The least squares specification is preferred because it enables the
procedure to exploit the structure in the problem for numerical stability and performance.
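The structure being exploited is the residual form: with residual vector F(x) and its Jacobian J(x),
Gauss-Newton (the lambda = 0 limit of Levenberg-Marquardt) solves J d = -F for the step d rather
than working with the scalar objective alone. For this two-residual problem J is square and invertible,
so the step is exact; a hand-rolled Python sketch (illustrative only, not the algorithm PROC NLP
implements) reaches (1, 1) in two iterations from a standard starting point:

```python
# Gauss-Newton on the Rosenbrock residuals f1 = 10*(x2 - x1^2), f2 = 1 - x1.
# The 2x2 Jacobian is [[-20*x1, 10], [-1, 0]]; each step solves J d = -F.
def gauss_newton(x1, x2, iters=2):
    for _ in range(iters):
        f1 = 10 * (x2 - x1 * x1)
        f2 = 1 - x1
        # Solve [[-20*x1, 10], [-1, 0]] @ d = -[f1, f2] by substitution:
        # the second equation reads -d1 = -f2, so d1 = f2.
        d1 = f2
        d2 = (-f1 + 20 * x1 * d1) / 10
        x1, x2 = x1 + d1, x2 + d2
    return x1, x2

print(gauss_newton(-1.2, 1.0))  # near (1.0, 1.0)
```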
PROC NLP displays the iteration history and the solution to this least squares problem as shown
in Figure 6.1. It shows that the solution has x1 = 1 and x2 = 1. As expected in an unconstrained
problem, the gradient at the solution is very close to 0.
Figure 6.1 Least Squares Minimization
PROC NLP: Least Squares Minimization
Levenberg-Marquardt Optimization
Scaling Update of More (1978)
Parameter Estimates 2
Functions (Observations) 2
Optimization Start
Active Constraints 0 Objective Function 7.7115046337
Max Abs Gradient Element 38.778863865 Radius 450.91265904
Actual
Max Abs Over
Rest Func Act Objective Obj Fun Gradient Pred
Iter arts Calls Con Function Change Element Lambda Change
1 0 2 0 7.41150 0.3000 77.0013 0 0.0389
2 0 3 0 1.9337E-28 7.4115 6.39E-14 0 1.000
Optimization Results
Iterations 2 Function Calls 4
Jacobian Calls 3 Active Constraints 0
Objective Function 1.933695E-28 Max Abs Gradient Element 6.394885E-14
Lambda 0 Actual Over Pred Change 1
Radius 7.7001288198
ABSGCONV convergence criterion satisfied.
PROC NLP: Least Squares Minimization
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Estimate Function
1 x1 1.000000 6.394885E-14
2 x2 1.000000 -2.22045E-14
Value of Objective Function = 1.933695E-28
Boundary Constraints on the Decision Variables
Bounds on the decision variables can be used. Suppose, for example, that it is necessary to constrain
the decision variables in the previous example to be less than 0.5. That can be done by adding a
BOUNDS statement.
proc nlp;
   lsq f1 f2;
   decvar x1 x2;
   bounds x1-x2 <= .5;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
run;
The solution in Figure 6.2 shows that the decision variables meet the constraint bounds.
Figure 6.2 Least Squares with Bounds Solution
PROC NLP: Least Squares Minimization
Scaling Update of More (1978)
PROC NLP: Least Squares Minimization
Optimization Results
Parameter Estimates
Gradient Active
Objective Bound
N Parameter Estimate Function Constraint
1 x1 0.500000 -0.500000 Upper BC
2 x2 0.250000 0
Linear Constraints on the Decision Variables
More general linear equality or inequality constraints of the form

\sum_{j=1}^{n} a_{ij} x_j \;\{\le \mid = \mid \ge\}\; b_i \qquad \text{for } i = 1, \ldots, m

can be specified in a LINCON statement. For example, suppose that in addition to the bounds
constraints on the decision variables it is necessary to guarantee that the sum x_1 + x_2 is less than or
equal to 0.6. That can be achieved by adding a LINCON statement:
proc nlp;
   lsq f1 f2;
   decvar x1 x2;
   bounds x1-x2 <= .5;
   lincon x1 + x2 <= .6;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
run;
The output in Figure 6.3 displays the iteration history and the convergence criterion.
Figure 6.3 Least Squares with Bounds and Linear Constraints Iteration History
PROC NLP: Least Squares Minimization
Levenberg-Marquardt Optimization
Scaling Update of More (1978)
Parameter Estimates 2
Functions (Observations) 2
Lower Bounds 0
Upper Bounds 2
Linear Constraints 1
Actual
Max Abs Over
Rest Func Act Objective Obj Fun Gradient Pred
Iter arts Calls Con Function Change Element Lambda Change
1 0 2 0' 0.23358 3.6205 3.3399 0 0.939
2 1 6 0' 0.16687 0.0667 0.4865 174.8 0.535
3 2 8 1 0.16679 0.000084 0.2677 0.00430 0.0008
4 2 9 1 0.16658 0.000209 0.000650 0 0.998
5 2 10 1 0.16658 1.233E-9 1.185E-6 0 0.998
Optimization Results
Iterations 5 Function Calls 11
Jacobian Calls 7 Active Constraints 1
Objective Function 0.1665792899 Max Abs Gradient Element 1.1847291E-6
Lambda 0 Actual Over Pred Change 0.9981768536
Radius 0.0000994255
ABSGCONV convergence criterion satisfied.
Figure 6.4 shows that the solution satisfies the linear constraint. Note that the procedure displays the
active constraints (the constraints that are tight) at optimality.
Figure 6.4 Least Squares with Bounds and Linear Constraints Solution
PROC NLP: Least Squares Minimization
Scaling Update of More (1978)
Figure 6.4 continued
PROC NLP: Least Squares Minimization
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Estimate Function
1 x1 0.423645 -0.312000
2 x2 0.176355 -0.312000
Linear Constraints Evaluated at Solution
1 ACT 2.7756E-17 = 0.6000 - 1.0000 * x1 - 1.0000 * x2
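The printed solution can be checked against the problem definition by direct evaluation. The following Python sketch is a verification outside SAS, using the estimates rounded as printed (x1 = 0.423645, x2 = 0.176355); it recovers the objective value and shows that the linear constraint is active:

```python
# Check the constrained least squares solution reported by PROC NLP
# (x1 = 0.423645, x2 = 0.176355) against the problem definition.
# Pure-Python verification sketch, not part of PROC NLP.

def objective(x1, x2):
    f1 = 10.0 * (x2 - x1 * x1)
    f2 = 1.0 - x1
    return 0.5 * (f1 * f1 + f2 * f2)

x1, x2 = 0.423645, 0.176355
fval = objective(x1, x2)

print(round(fval, 5))           # about 0.16658, as in the iteration history
print(x1 <= 0.5 and x2 <= 0.5)  # bounds satisfied
print(abs((x1 + x2) - 0.6))     # linear constraint active (residual near 0)
```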
Nonlinear Constraints on the Decision Variables
More general nonlinear equality or inequality constraints can be specified using an NLINCON
statement. Consider the least squares problem with the additional constraint

x_1^2 - 2 x_2 \ge 0

This constraint is specified by a new function c1 constrained to be greater than or equal to 0 in the
NLINCON statement. The function c1 is defined in the programming statements.
proc nlp tech=QUANEW;
   min f;
   decvar x1 x2;
   bounds x1-x2 <= .5;
   lincon x1 + x2 <= .6;
   nlincon c1 >= 0;
   c1 = x1 * x1 - 2 * x2;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
   f = .5 * (f1 * f1 + f2 * f2);
run;
Figure 6.5 shows the iteration history, and Figure 6.6 shows the solution to this problem.
Figure 6.5 Least Squares with Bounds, Linear and Nonlinear Constraints, Iteration History
PROC NLP: Nonlinear Minimization
Dual Quasi-Newton Optimization
Modified VMCWD Algorithm of Powell (1978, 1982)
Dual Broyden - Fletcher - Goldfarb - Shanno Update (DBFGS)
Lagrange Multiplier Update of Powell(1982)
Parameter Estimates 2
Lower Bounds 0
Upper Bounds 2
Linear Constraints 1
Nonlinear Constraints 1
Optimization Start
Objective Function 3.6940664349 Maximum Constraint 0
Violation
Maximum Gradient of the 24.167449944
Lagran Func
Maximum
Gradient
Element
Maximum Predicted of the
Function Objective Constraint Function Step Lagrange
Iter Restarts Calls Function Violation Reduction Size Function
1 0 9 1.33999 0 1.1315 0.558 7.172
2 0 10 0.81134 0 0.2944 1.000 2.896
3 0 11 0.61022 0 0.1518 1.000 2.531
4 0 12 0.49146 0 0.1575 1.000 1.736
5 0 13 0.37940 0 0.0957 1.000 0.464
6 0 14 0.34677 0 0.0367 1.000 0.603
7 0 15 0.33136 0 0.00254 1.000 0.257
8 0 16 0.33020 0 0.000332 1.000 0.0218
9 0 17 0.33003 0 3.92E-6 1.000 0.00200
10 0 18 0.33003 0 2.053E-8 1.000 0.00002
Optimization Results
Iterations 10 Function Calls 19
Gradient Calls 13 Active Constraints 1
Objective Function 0.3300307258 Maximum Constraint 0
Violation
Maximum Projected Gradient 9.4437885E-6 Value Lagrange Function 0.3300307155
Maximum Gradient of the 9.1683548E-6 Slope of Search Direction -2.053448E-8
Lagran Func
Figure 6.6 Least Squares with Bounds, Linear and Nonlinear Constraints, Solution
PROC NLP: Nonlinear Minimization
NOTE: At least one element of the (projected) gradient is greater than 1e-3.
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Gradient Gradient
Objective Lagrange
N Parameter Estimate Function Function
1 x1 0.246960 0.753147 0.753147
2 x2 0.030495 -3.049459 -3.049459
Value of Objective Function = 0.3300307479
Linear Constraints Evaluated at Solution
1 0.32255 = 0.6000 - 1.0000 * x1 - 1.0000 * x2
Values of Nonlinear Constraints
Lagrange
Constraint Value Residual Multiplier
[ 2 ] c1_G 2.112E-8 2.112E-8 .
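As before, the reported solution can be checked by evaluating the objective and every constraint at the printed estimates. A pure-Python sketch (a verification outside SAS, using the values rounded as printed):

```python
# Evaluate the objective and all constraints at the solution PROC NLP
# reports (x1 = 0.246960, x2 = 0.030495). Pure-Python check, not SAS.

x1, x2 = 0.246960, 0.030495

f1 = 10.0 * (x2 - x1 * x1)
f2 = 1.0 - x1
f = 0.5 * (f1 * f1 + f2 * f2)
c1 = x1 * x1 - 2.0 * x2          # nonlinear constraint, active (value near 0)

print(round(f, 5))               # about 0.33003
print(x1 <= 0.5 and x2 <= 0.5)   # bounds satisfied
print(x1 + x2 <= 0.6)            # linear constraint satisfied (inactive)
print(abs(c1) < 1e-5)            # nonlinear constraint tight at printed precision
```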
Not all of the optimization methods support nonlinear constraints. In particular the Levenberg-
Marquardt method, the default for LSQ, does not support nonlinear constraints. (For more
information about the particular algorithms, see the section Optimization Algorithms on page 362.) The
Quasi-Newton method is the prime choice for solving nonlinear programs with nonlinear constraints.
The option TECH=QUANEW in the PROC NLP statement causes the Quasi-Newton method to be
used.
A Simple Maximum Likelihood Example
The following is a very simple example of a maximum likelihood estimation problem with the log
likelihood function:

l(\mu, \sigma) = -\log(\sigma) - \frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2

The maximum likelihood estimates of the parameters \mu and \sigma form the solution to

\max_{\mu,\, \sigma > 0} \; \sum_i l_i(\mu, \sigma)
where

l_i(\mu, \sigma) = -\log(\sigma) - \frac{1}{2} \left( \frac{x_i - \mu}{\sigma} \right)^2

In the following DATA step, values for x are input into the SAS data set X; this data set provides the
values of x_i.
data x;
input x @@;
datalines;
1 3 4 5 7
;
In the following statements, the DATA=X specification drives the building of the objective function.
When each observation in the DATA=X data set is read, a new term l_i(\mu, \sigma) using the value of x_i is
added to the objective function LOGLIK specified in the MAX statement.
proc nlp data=x vardef=n covariance=h pcov phes;
   profile mean sigma / alpha=.5 .1 .05 .01;
   max loglik;
   parms mean=0, sigma=1;
   bounds sigma > 1e-12;
   loglik = -0.5 * ((x - mean) / sigma) ** 2 - log(sigma);
run;
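For this small data set the maximum likelihood estimates have closed forms, so the PROC NLP results can be verified directly. A pure-Python sketch (VARDEF=N corresponds to dividing the sum of squares by n, not n - 1):

```python
# Closed-form ML estimates for the normal model, matching PROC NLP's
# VARDEF=N setting (variance divisor n, not n-1). Pure-Python sketch.
from math import sqrt, log

x = [1.0, 3.0, 4.0, 5.0, 7.0]
n = len(x)

mean = sum(x) / n
sigma = sqrt(sum((xi - mean) ** 2 for xi in x) / n)

# Log likelihood as written in the PROC NLP statements, summed over x_i
loglik = sum(-0.5 * ((xi - mean) / sigma) ** 2 - log(sigma) for xi in x)

print(mean, sigma)   # 4.0 2.0
print(loglik)        # about -5.965735903
```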
After a few iterations of the default Newton-Raphson optimization algorithm, PROC NLP produces
the results shown in Figure 6.7.
Figure 6.7 Maximum Likelihood Estimates
PROC NLP: Nonlinear Maximization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.894427 4.472136 0.006566
2 sigma 2.000000 0.632456 3.162278 0.025031
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean -1.33149E-10
2 sigma 5.6064146E-9
Value of Objective Function = -5.965735903
In unconstrained maximization, the gradient (that is, the vector of first derivatives) at the solution
must be very close to zero and the Hessian matrix at the solution (that is, the matrix of second
derivatives) must have nonpositive eigenvalues. The Hessian matrix is displayed in Figure 6.8.
Figure 6.8 Hessian Matrix
PROC NLP: Nonlinear Maximization
Hessian Matrix
mean sigma
mean -1.250000003 1.33149E-10
sigma 1.33149E-10 -2.500000014
Determinant = 3.1250000245
Matrix has Only Negative Eigenvalues
Under reasonable assumptions, the approximate standard errors of the estimates are the square roots
of the diagonal elements of the covariance matrix of the parameter estimates, which (because of the
COV=H specification) is the same as the inverse of the Hessian matrix. The covariance matrix is
shown in Figure 6.9.
Figure 6.9 Covariance Matrix
PROC NLP: Nonlinear Maximization
Covariance Matrix 2: H = (NOBS/d) inv(G)
mean sigma
mean 0.7999999982 4.260769E-11
sigma 4.260769E-11 0.3999999978
Factor sigm = 1
Determinant = 0.3199999975
Matrix has 2 Positive Eigenvalue(s)
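These numbers can be reproduced analytically: for the normal log likelihood the Hessian at the MLE is diagonal, and inverting its negative gives the covariance matrix. A pure-Python check (a sketch outside SAS, not PROC NLP internals):

```python
# Analytic Hessian of the normal log likelihood at (mean, sigma) = (4, 2)
# for the five data points, and the resulting approximate standard errors
# (COV=H: covariance = inverse of the negative Hessian). Sketch only.
from math import sqrt

x = [1.0, 3.0, 4.0, 5.0, 7.0]
mean, sigma = 4.0, 2.0
n = len(x)
ss = sum((xi - mean) ** 2 for xi in x)        # = 20

# Second derivatives of sum_i l_i(mean, sigma)
h_mm = -n / sigma**2                          # d2l/dmean2  = -1.25
h_ss = n / sigma**2 - 3.0 * ss / sigma**4     # d2l/dsigma2 = -2.5
h_ms = -2.0 * sum(xi - mean for xi in x) / sigma**3   # = 0 at the MLE

# Covariance = inv(-H); H is diagonal here, so invert elementwise
cov_mm, cov_ss = -1.0 / h_mm, -1.0 / h_ss
print(cov_mm, cov_ss)                 # 0.8 and 0.4, as in Figure 6.9
print(sqrt(cov_mm), sqrt(cov_ss))     # approx std errors 0.894..., 0.632...
```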
The PROFILE statement computes the values of the profile likelihood confidence limits on SIGMA
and MEAN, as shown in Figure 6.10.
Figure 6.10 Confidence Limits
PROC NLP: Nonlinear Maximization
Wald and PL Confidence Limits
Profile Likelihood
N Parameter Estimate Alpha Confidence Limits
1 mean 4.000000 0.500000 3.384431 4.615569
1 mean . 0.100000 2.305716 5.694284
1 mean . 0.050000 1.849538 6.150462
1 mean . 0.010000 0.670351 7.329649
2 sigma 2.000000 0.500000 1.638972 2.516078
2 sigma . 0.100000 1.283506 3.748633
2 sigma . 0.050000 1.195936 4.358321
2 sigma . 0.010000 1.052584 6.064107
Wald and PL Confidence Limits
Wald Confidence Limits
3.396718 4.603282
2.528798 5.471202
2.246955 5.753045
1.696108 6.303892
1.573415 2.426585
0.959703 3.040297
0.760410 3.239590
0.370903 3.629097
Syntax: NLP Procedure
Below are statements used in PROC NLP, listed in alphabetical order as they appear in the text that
follows.
PROC NLP options ;
ARRAY function names ;
BOUNDS boundary constraints ;
BY variables ;
CRPJAC variables ;
DECVAR function names ;
GRADIENT variables ;
HESSIAN variables ;
INCLUDE model files ;
JACNLC variables ;
JACOBIAN function names ;
LABEL decision variable labels ;
LINCON linear constraints ;
MATRIX matrix specification ;
MIN, MAX, or LSQ function names ;
MINQUAD or MAXQUAD matrix, vector, or number ;
NLINCON nonlinear constraints ;
PROFILE profile specification ;
Program Statements ;
Functional Summary
The following table outlines the options in PROC NLP classified by function.
Table 6.1 Functional Summary
Description Statement Option
Input Data Set Options:
Input data set PROC NLP DATA=
Initial values and constraints PROC NLP INEST=
Quadratic objective function PROC NLP INQUAD=
Program statements PROC NLP MODEL=
Skip missing value observations PROC NLP NOMISS
Output Data Set Options:
Variables and derivatives PROC NLP OUT=
Result parameter values PROC NLP OUTEST=
Program statements PROC NLP OUTMODEL=
Combine various OUT. . . statements PROC NLP OUTALL
CRP Jacobian in the OUTEST= data set PROC NLP OUTCRPJAC
Derivatives in the OUT= data set PROC NLP OUTDER=
Grid in the OUTEST= data set PROC NLP OUTGRID
Hessian in the OUTEST= data set PROC NLP OUTHESSIAN
Iterative output in the OUTEST= data set PROC NLP OUTITER
Jacobian in the OUTEST= data set PROC NLP OUTJAC
NLC Jacobian in the OUTEST= data set PROC NLP OUTNLCJAC
Time in the OUTEST= data set PROC NLP OUTTIME
Optimization Options:
Minimization method PROC NLP TECH=
Update technique PROC NLP UPDATE=
Version of optimization technique PROC NLP VERSION=
Line-search method PROC NLP LINESEARCH=
Line-search precision PROC NLP LSPRECISION=
Type of Hessian scaling PROC NLP HESCAL=
Start for approximated Hessian PROC NLP INHESSIAN=
Iteration number for update restart PROC NLP RESTART=
Initial Value Options:
Produce best grid points PROC NLP BEST=
Infeasible points in grid search PROC NLP INFEASIBLE
Pseudorandom initial values PROC NLP RANDOM=
Constant initial values PROC NLP INITIAL=
Derivative Options:
Finite-difference derivatives PROC NLP FD=
Finite-difference derivatives PROC NLP FDHESSIAN=
Compute finite-difference interval PROC NLP FDINT=
Use only diagonal of Hessian PROC NLP DIAHES
Test gradient specication PROC NLP GRADCHECK=
Constraint Options:
Range for active constraints PROC NLP LCEPSILON=
LM tolerance for deactivating PROC NLP LCDEACT=
Tolerance for dependent constraints PROC NLP LCSINGULAR=
Sum all observations for continuous functions NLINCON / SUMOBS
Evaluate each observation for continuous functions NLINCON / EVERYOBS
Termination Criteria Options:
Maximum number of function calls PROC NLP MAXFUNC=
Maximum number of iterations PROC NLP MAXITER=
Minimum number of iterations PROC NLP MINITER=
Upper limit on real time PROC NLP MAXTIME=
Absolute function convergence criterion PROC NLP ABSCONV=
Absolute function convergence criterion PROC NLP ABSFCONV=
Absolute gradient convergence criterion PROC NLP ABSGCONV=
Absolute parameter convergence criterion PROC NLP ABSXCONV=
Relative function convergence criterion PROC NLP FCONV=
Relative function convergence criterion PROC NLP FCONV2=
Relative gradient convergence criterion PROC NLP GCONV=
Relative gradient convergence criterion PROC NLP GCONV2=
Relative parameter convergence criterion PROC NLP XCONV=
Used in FCONV, GCONV criterion PROC NLP FSIZE=
Used in XCONV criterion PROC NLP XSIZE=
Covariance Matrix Options:
Type of covariance matrix PROC NLP COV=
\sigma^2 factor of COV matrix PROC NLP SIGSQ=
Determine factor of COV matrix PROC NLP VARDEF=
Absolute singularity for inertia PROC NLP ASINGULAR=
Relative M singularity for inertia PROC NLP MSINGULAR=
Relative V singularity for inertia PROC NLP VSINGULAR=
Threshold for Moore-Penrose inverse PROC NLP G4=
Tolerance for singular COV matrix PROC NLP COVSING=
Profile confidence limits PROC NLP CLPARM=
Printed Output Options:
Display (almost) all printed output PROC NLP PALL
Suppress all printed output PROC NLP NOPRINT
Reduce some default output PROC NLP PSHORT
Reduce most default output PROC NLP PSUMMARY
Display initial values and gradients PROC NLP PINIT
Display optimization history PROC NLP PHISTORY
Display Jacobian matrix PROC NLP PJACOBI
Display crossproduct Jacobian matrix PROC NLP PCRPJAC
Display Hessian matrix PROC NLP PHESSIAN
Display Jacobian of nonlinear constraints PROC NLP PNLCJAC
Display values of grid points PROC NLP PGRID
Display values of functions in LSQ, MIN, MAX PROC NLP PFUNCTION
Display approximate standard errors PROC NLP PSTDERR
Display covariance matrix PROC NLP PCOV
Display eigenvalues for covariance matrix PROC NLP PEIGVAL
Print code evaluation problems PROC NLP PERROR
Print measures of real time PROC NLP PTIME
Display model program, variables PROC NLP LIST
Display compiled model program PROC NLP LISTCODE
Step Length Options:
Damped steps in line search PROC NLP DAMPSTEP=
Maximum trust region radius PROC NLP MAXSTEP=
Initial trust region radius PROC NLP INSTEP=
Profile Point and Confidence Interval Options:
Factor relating discrepancy function to \chi^2 quantile PROFILE FFACTOR=
Scale for y values written to OUTEST= data set PROFILE FORCHI=
Upper bound for confidence limit search PROFILE FEASRATIO=
Write all confidence limit parameter estimates to OUTEST= data set PROFILE OUTTABLE
Miscellaneous Options:
Number of accurate digits in objective function PROC NLP FDIGITS=
Number of accurate digits in nonlinear constraints PROC NLP CDIGITS=
General singularity criterion PROC NLP SINGULAR=
Do not compute inertia of matrices PROC NLP NOEIGNUM
Check optimality in neighborhood PROC NLP OPTCHECK=
PROC NLP Statement
PROC NLP options ;
This statement invokes the NLP procedure. The following options are used with the PROC NLP
statement.
ABSCONV=r
ABSTOL=r
specifies an absolute function convergence criterion. For minimization (maximization),
termination requires f(x^{(k)}) \le (\ge)\; r. The default value of ABSCONV is the negative
(positive) square root of the largest double precision value.
ABSFCONV=r[n]
ABSFTOL=r[n]
specifies an absolute function convergence criterion. For all techniques except NMSIMP,
termination requires a small change of the function value in successive iterations:

|f(x^{(k-1)}) - f(x^{(k)})| \le r

For the NMSIMP technique the same formula is used, but x^{(k)} is defined as the vertex with
the lowest function value, and x^{(k-1)} is defined as the vertex with the highest function value
in the simplex. The default value is r = 0. The optional integer value n specifies the number
of successive iterations for which the criterion must be satisfied before the process can be
terminated.
ABSGCONV=r[n]
ABSGTOL=r[n]
specifies the absolute gradient convergence criterion. Termination requires the maximum
absolute gradient element to be small:

\max_j |g_j(x^{(k)})| \le r

This criterion is not used by the NMSIMP technique. The default value is r = 1E-5. The
optional integer value n specifies the number of successive iterations for which the criterion
must be satisfied before the process can be terminated.
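Expressed as code, this test is just a comparison of the largest absolute gradient element against the threshold. The following Python sketch uses illustrative helper names (not PROC NLP internals) and the default threshold r = 1E-5; the first call uses the near-zero gradient from the unconstrained example above:

```python
# Sketch of the ABSGCONV termination test: stop when the largest
# absolute gradient element falls at or below the threshold r.
# Helper name is illustrative, not part of PROC NLP.

def absgconv_satisfied(grad, r=1e-5):
    return max(abs(g) for g in grad) <= r

print(absgconv_satisfied([6.394885e-14, -2.22045e-14]))  # True
print(absgconv_satisfied([0.1, -0.3]))                   # False
```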
ABSXCONV=r[n]
ABSXTOL=r[n]
specifies the absolute parameter convergence criterion. For all techniques except NMSIMP,
termination requires a small Euclidean distance between successive parameter vectors:

\| x^{(k)} - x^{(k-1)} \|_2 \le r

For the NMSIMP technique, termination requires either a small length \alpha^{(k)} of the vertices of a
restart simplex,

\alpha^{(k)} \le r

or a small simplex size,

\delta^{(k)} \le r

where the simplex size \delta^{(k)} is defined as the L_1 distance of the simplex vertex y^{(k)} with the
smallest function value to the other n simplex points x_l^{(k)} \ne y^{(k)}:

\delta^{(k)} = \sum_{x_l \ne y} \| x_l^{(k)} - y^{(k)} \|_1

The default value is r = 1E-4 for the COBYLA NMSIMP technique, r = 1E-8 for the
standard NMSIMP technique, and r = 0 otherwise. The optional integer value n specifies the
number of successive iterations for which the criterion must be satisfied before the process can
be terminated.
ASINGULAR=r
ASING=r
specifies an absolute singularity criterion for measuring singularity of Hessian and crossproduct
Jacobian and their projected forms, which may have to be converted to compute the
covariance matrix. The default is the square root of the smallest positive double precision
value. For more information, see the section Covariance Matrix on page 385.
BEST=i
produces the i best grid points only. This option not only restricts the output, it also can
significantly reduce the computation time needed for sorting the grid point information.
CDIGITS=r
specifies the number of accurate digits in nonlinear constraint evaluations. Fractional values
such as CDIGITS=4.7 are allowed. The default value is r = -\log_{10}(\epsilon), where \epsilon is the
machine precision. The value of r is used to compute the interval length h for the computation
of finite-difference approximations of the Jacobian matrix of nonlinear constraints.
CLPARM= PL | WALD | BOTH
is similar to but not the same as that used by other SAS procedures. Using CLPARM=BOTH
is equivalent to specifying
PROFILE / ALPHA=0.5 0.1 0.05 0.01 OUTTABLE;
The CLPARM=BOTH option specifies that profile confidence limits (PL CLs) for all
parameters and for \alpha = .5, .1, .05, .01 are computed and displayed or written to the OUTEST=
data set. Computing the profile confidence limits for all parameters can be very expensive
and should be avoided when a difficult optimization problem or one with many parameters
is solved. The OUTTABLE option is valid only when an OUTEST= data set is specified in
the PROC NLP statement. For CLPARM=BOTH, the table of displayed output contains the
Wald confidence limits computed from the standard errors as well as the PL CLs. The Wald
confidence limits are not computed (displayed or written to the OUTEST= data set) unless the
approximate covariance matrix of parameters is computed.
COV= 1 | 2 | 3 | 4 | 5 | 6 | M | H | J | B | E | U
COVARIANCE= 1 | 2 | 3 | 4 | 5 | 6 | M | H | J | B | E | U
specifies one of six formulas for computing the covariance matrix. For more information, see
the section Covariance Matrix on page 385.
COVSING=r
specifies a threshold r > 0 that determines whether the eigenvalues of a singular Hessian
matrix or crossproduct Jacobian matrix are considered to be zero. For more information, see
the section Covariance Matrix on page 385.
DAMPSTEP[=r]
DS[=r]
specifies that the initial step length value \alpha^{(0)} for each line search (used by the QUANEW,
HYQUAN, CONGRA, or NEWRAP technique) cannot be larger than r times the step length
value used in the former iteration. If the DAMPSTEP option is specified but r is not specified,
the default is r = 2. The DAMPSTEP=r option can prevent the line-search algorithm from
repeatedly stepping into regions where some objective functions are difficult to compute or
where they could lead to floating point overflows during the computation of objective functions
and their derivatives. The DAMPSTEP=r option can save time-costly function calls during the
line searches of objective functions that result in very small steps. For more information, see
the section Restricting the Step Length on page 381.
DATA=SAS-data-set
allows variables from the specified data set to be used in the specification of the objective
function f. For more information, see the section DATA= Input Data Set on page 388.
DIAHES
specifies that only the diagonal of the Hessian or crossproduct Jacobian is used. This saves
function evaluations but may slow the convergence process considerably. Note that the
DIAHES option refers to both the Hessian and the crossproduct Jacobian when using the
LSQ statement. When derivatives are specified using the HESSIAN or CRPJAC statement,
these statements must refer only to the n diagonal derivative elements (otherwise, the
n(n+1)/2 derivatives of the lower triangle must be specified). The DIAHES option is ignored
if a quadratic programming problem with a constant Hessian is specified by TECH=QUADAS or
TECH=LICOMP.
FCONV=r[n]
FTOL=r[n]
specifies the relative function convergence criterion. For all techniques except NMSIMP,
termination requires a small relative change of the function value in successive iterations:

\frac{|f(x^{(k)}) - f(x^{(k-1)})|}{\max(|f(x^{(k-1)})|, \mathrm{FSIZE})} \le r

where FSIZE is defined by the FSIZE= option. For the NMSIMP technique, the same formula
is used, but x^{(k)} is defined as the vertex with the lowest function value, and x^{(k-1)} is defined
as the vertex with the highest function value in the simplex. The default value is
r = 10^{-\mathrm{FDIGITS}}, where FDIGITS is the value of the FDIGITS= option. The optional integer
value n specifies the number of successive iterations for which the criterion must be satisfied
before the process can be terminated.
FCONV2=r[n]
FTOL2=r[n]
specifies another function convergence criterion. For least squares problems
and all techniques except NMSIMP, termination requires a small predicted reduction

df^{(k)} \approx f(x^{(k)}) - f(x^{(k)} + s^{(k)})

of the objective function. The predicted reduction

df^{(k)} = -g^{(k)T} s^{(k)} - \frac{1}{2} s^{(k)T} G^{(k)} s^{(k)} = -\frac{1}{2} s^{(k)T} g^{(k)} \le r

is based on approximating the objective function f by the first two terms of the Taylor series
and substituting the Newton step

s^{(k)} = -[G^{(k)}]^{-1} g^{(k)}

For the NMSIMP technique, termination requires a small standard deviation of the function
values of the n + 1 simplex vertices x_l^{(k)}, l = 0, \ldots, n:

\sqrt{ \frac{1}{n+1} \sum_l \left( f(x_l^{(k)}) - \bar{f}(x^{(k)}) \right)^2 } \le r

where \bar{f}(x^{(k)}) = \frac{1}{n+1} \sum_l f(x_l^{(k)}). If there are n_{act} boundary constraints active at x^{(k)}, the
mean and standard deviation are computed only for the n + 1 - n_{act} unconstrained vertices.
The default value is r = 1E-6 for the NMSIMP technique and the QUANEW technique with
nonlinear constraints, and r = 0 otherwise. The optional integer value n specifies the number
of successive iterations for which the criterion must be satisfied before the process can be
terminated.
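The two forms of the predicted reduction agree whenever the Newton step is substituted, which can be checked numerically. A one-dimensional Python sketch with illustrative values (not PROC NLP internals):

```python
# Sketch of the FCONV2 predicted reduction for a 1-D quadratic model:
# with the Newton step s = -g/G, the full form -g*s - (1/2)*s*G*s and
# the short form -(1/2)*s*g give the same value. Illustrative only.

g, G = 4.0, 2.0            # gradient and (positive) Hessian at x_k
s = -g / G                 # Newton step = -2.0

df_full = -g * s - 0.5 * s * G * s
df_short = -0.5 * s * g
print(df_full, df_short)   # both 4.0
```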
FD[= FORWARD | CENTRAL | number ]
specifies that all derivatives be computed using finite-difference approximations. The following
specifications are permitted:

FD=FORWARD uses forward differences.
FD=CENTRAL uses central differences.
FD=number uses central differences for the initial and final evaluations of the gradient,
Jacobian, and Hessian. During iteration, start with forward differences and
switch to a corresponding central-difference formula during the iteration
process when one of the following two criteria is satisfied:
- The absolute maximum gradient element is less than or equal to number
  times the ABSGCONV threshold.
- The term on the left of the GCONV criterion is less than or equal to
  max(1.0E-6, number \times GCONV threshold). The 1.0E-6 ensures
  that the switch is done, even if you set the GCONV threshold to zero.
FD is equivalent to FD=100.

Note that the FD and FDHESSIAN options cannot apply at the same time. The FDHESSIAN
option is ignored when only first-order derivatives are used, for example, when the LSQ
statement is used and the Hessian is not explicitly needed (displayed or written to a data
set). For more information, see the section Finite-Difference Approximations of Derivatives
on page 373.
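The trade-off between the two formulas can be sketched in a few lines of Python (an illustration of forward versus central differences in general, not PROC NLP's interval-selection logic):

```python
# Forward vs. central finite differences for f(x) = x**3 at x = 1
# (exact derivative 3). Central differences cost roughly twice as many
# function calls per gradient but are markedly more accurate. Sketch only.

def f(x):
    return x ** 3

def forward_diff(f, x, h=1e-6):
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2.0 * h)

fwd = forward_diff(f, 1.0)
ctr = central_diff(f, 1.0)
print(abs(fwd - 3.0) > abs(ctr - 3.0))   # True: central error is smaller
```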
FDHESSIAN[=FORWARD | CENTRAL]
FDHES[=FORWARD | CENTRAL]
FDH[=FORWARD | CENTRAL]
specifies that second-order derivatives be computed using finite-difference approximations
based on evaluations of the gradients.

FDHESSIAN=FORWARD uses forward differences.
FDHESSIAN=CENTRAL uses central differences.
FDHESSIAN uses forward differences for the Hessian except
for the initial and final output.

Note that the FD and FDHESSIAN options cannot apply at the same time. For more
information, see the section Finite-Difference Approximations of Derivatives on page 373.
FDIGITS=r
specifies the number of accurate digits in evaluations of the objective function. Fractional
values such as FDIGITS=4.7 are allowed. The default value is r = -\log_{10}(\epsilon), where \epsilon is the
machine precision. The value of r is used to compute the interval length h for the computation
of finite-difference approximations of the derivatives of the objective function and for the
default value of the FCONV= option.
FDINT= OBJ | CON | ALL
specifies how the finite-difference intervals h should be computed. For FDINT=OBJ, the
interval h is based on the behavior of the objective function; for FDINT=CON, the interval h is
based on the behavior of the nonlinear constraint functions; and for FDINT=ALL, the interval
h is based on the behavior of the objective function and the nonlinear constraint functions.
For more information, see the section Finite-Difference Approximations of Derivatives on
page 373.
FSIZE=r
specifies the FSIZE parameter of the relative function and relative gradient termination criteria.
The default value is r = 0. For more details, refer to the FCONV= and GCONV= options.
G4=n
is used when the covariance matrix is singular. The value n > 0 determines which generalized
inverse is computed. The default value of n is 60. For more information, see the section
Covariance Matrix on page 385.
GCONV=r[n]
GTOL=r[n]
specifies the relative gradient convergence criterion. For all techniques except the CONGRA
and NMSIMP techniques, termination requires that the normalized predicted function reduction
is small:

\frac{g(x^{(k)})^T \, [G^{(k)}]^{-1} \, g(x^{(k)})}{\max(|f(x^{(k)})|, \mathrm{FSIZE})} \le r

where FSIZE is defined by the FSIZE= option. For the CONGRA technique (where a reliable
Hessian estimate G is not available),

\frac{\| g(x^{(k)}) \|_2^2 \; \| s(x^{(k)}) \|_2}{\| g(x^{(k)}) - g(x^{(k-1)}) \|_2 \; \max(|f(x^{(k)})|, \mathrm{FSIZE})} \le r

is used. This criterion is not used by the NMSIMP technique. The default value is r = 1E-8.
The optional integer value n specifies the number of successive iterations for which the criterion
must be satisfied before the process can be terminated.
GCONV2=r[n]
GTOL2=r[n]
specifies another relative gradient convergence criterion,

\max_j \frac{|g_j(x^{(k)})|}{\sqrt{f(x^{(k)}) \, G^{(k)}_{j,j}}} \le r

This option is valid only when using the TRUREG, LEVMAR, NRRIDG, and NEWRAP
techniques on least squares problems. The default value is r = 0. The optional integer value n
specifies the number of successive iterations for which the criterion must be satisfied before
the process can be terminated.
GRADCHECK[= NONE | FAST | DETAIL]
GC[= NONE | FAST | DETAIL]
Specifying GRADCHECK=DETAIL computes a test vector and test matrix to check whether
the gradient g specified by a GRADIENT statement (or indirectly by a JACOBIAN statement)
is appropriate for the function f computed by the program statements. If the specification
of the first derivatives is correct, the elements of the test vector and test matrix should be
relatively small. For very large optimization problems, the algorithm can be too expensive
in terms of computer time and memory. If the GRADCHECK option is not specified, a fast
derivative test identical to the GRADCHECK=FAST specification is performed by default. It
is possible to suppress the default derivative test by specifying GRADCHECK=NONE. For
more information, see the section Testing the Gradient Specification on page 376.
HESCAL= 0 | 1 | 2 | 3
HS= 0 | 1 | 2 | 3
specifies the scaling version of the Hessian or crossproduct Jacobian matrix used in NRRIDG,
TRUREG, LEVMAR, NEWRAP, or DBLDOG optimization. If the value of the HESCAL=
option is not equal to zero, the first iteration and each restart iteration set the diagonal scaling
matrix D^{(0)} = \mathrm{diag}(d_i^{(0)}):

d_i^{(0)} = \sqrt{\max(|G^{(0)}_{i,i}|, \epsilon)}

where G^{(0)}_{i,i} are the diagonal elements of the Hessian or crossproduct Jacobian matrix. In all
other iterations, the diagonal scaling matrix D^{(0)} = \mathrm{diag}(d_i^{(0)}) is updated depending on the
HESCAL= option:

HESCAL=0 specifies that no scaling is done.
HESCAL=1 specifies the Moré (1978) scaling update:

d_i^{(k+1)} = \max\left( d_i^{(k)}, \sqrt{\max(|G^{(k)}_{i,i}|, \epsilon)} \right)

HESCAL=2 specifies the Dennis, Gay, and Welsch (1981) scaling update:

d_i^{(k+1)} = \max\left( 0.6 \, d_i^{(k)}, \sqrt{\max(|G^{(k)}_{i,i}|, \epsilon)} \right)

HESCAL=3 specifies that d_i is reset in each iteration:

d_i^{(k+1)} = \sqrt{\max(|G^{(k)}_{i,i}|, \epsilon)}

where \epsilon is the relative machine precision. The default value is HESCAL=1 for LEVMAR
minimization and HESCAL=0 otherwise. Scaling of the Hessian or crossproduct Jacobian
matrix can be time-consuming in the case where general linear constraints are active.
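The HESCAL=1 update can be sketched as follows (Python with illustrative values; a sketch of the update rule only, not PROC NLP internals):

```python
# Sketch of the HESCAL=1 (More, 1978) diagonal scaling update:
# d_i never shrinks, and tracks sqrt(max(|G_ii|, eps)). Illustrative only.
from math import sqrt
import sys

eps = sys.float_info.epsilon

def more_update(d_i, g_ii):
    return max(d_i, sqrt(max(abs(g_ii), eps)))

d = sqrt(max(abs(4.0), eps))     # initial scale from G_ii = 4  -> 2.0
d = more_update(d, 9.0)          # G_ii grows   -> d becomes 3.0
d = more_update(d, 1.0)          # G_ii shrinks -> d stays 3.0
print(d)
```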
INEST=SAS-data-set
INVAR=SAS-data-set
ESTDATA=SAS-data-set
can be used to specify the initial values of the parameters defined in a DECVAR statement
as well as simple boundary constraints and general linear constraints. The INEST= data set
can contain additional variables with names corresponding to constants used in the program
statements. At the beginning of each run of PROC NLP, the values of the constants are read
from the PARMS observation, initializing the constants in the program statements. For more
information, see the section INEST= Input Data Set on page 388.
INFEASIBLE
IFP
specifies that the function values of both feasible and infeasible grid points are to be computed,
displayed, and written to the OUTEST= data set, although only the feasible grid points are
candidates for the starting point x^{(0)}. This option enables you to explore the shape of the
objective function of points surrounding the feasible region. For the output, the grid points are
sorted first with decreasing values of the maximum constraint violation. Points with the same
value of the maximum constraint violation are then sorted with increasing (minimization) or
decreasing (maximization) value of the objective function. Using the BEST= option restricts
only the number of best grid points in the displayed output, not those in the data set. The
INFEASIBLE option affects both the displayed output and the output saved to the OUTEST=
data set. The OUTGRID option can be used to write the grid points and their function values to
an OUTEST= data set. After small modifications (deleting unneeded information), this data set
can be used with the G3D procedure of SAS/GRAPH to generate a three-dimensional surface
plot of the objective function depending on two selected parameters. For more information on
grids, see the section DECVAR Statement on page 343.
INHESSIAN[=r]
INHESS[=r]
specifies how the initial estimate of the approximate Hessian is defined for the quasi-Newton
techniques QUANEW, DBLDOG, and HYQUAN. There are two alternatives:

v The =r specification is not used: the initial estimate of the approximate Hessian is set
to the true Hessian or crossproduct Jacobian at x^(0).
v The =r specification is used: the initial estimate of the approximate Hessian is set to
the multiple of the identity matrix rI.

By default, if INHESSIAN=r is not specified, the initial estimate of the approximate Hessian is
set to the multiple of the identity matrix rI, where the scalar r is computed from the magnitude
of the initial gradient. For most applications, this is a sufficiently good first approximation.
INITIAL=r
specifies a value r as the common initial value for all parameters for which no other initial
value assignments by the DECVAR statement or an INEST= data set are made.
INQUAD=SAS-data-set
can be used to specify (the nonzero elements of) the matrix H, the vector g, and the scalar c
of a quadratic programming problem,

   f(x) = (1/2) x^T H x + g^T x + c

This option cannot be used together with the NLINCON statement. Two forms (dense and
sparse) of the INQUAD= data set can be used. For more information, see the section
INQUAD= Input Data Set on page 389.
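As a quick illustration of the objective form, the quadratic function can be evaluated directly from H, g, and c. This Python sketch (the helper name quad_objective is hypothetical) works for a dense symmetric H:

```python
def quad_objective(H, g, c, x):
    """Evaluate f(x) = (1/2) x'Hx + g'x + c for dense H (list of rows)."""
    n = len(x)
    # quadratic term x'Hx
    hx = sum(H[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    # linear term g'x
    gx = sum(g[i] * x[i] for i in range(n))
    return 0.5 * hx + gx + c
```

For example, with H = 2I, g = (1, -1), c = 3, and x = (1, 2), the value is (1/2)(2 + 8) + (1 - 2) + 3 = 7.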
INSTEP=r
For highly nonlinear objective functions, such as the EXP function, the default initial radius of
the trust region algorithms TRUREG, DBLDOG, or LEVMAR or the default step length of
the line-search algorithms can result in arithmetic overflows. If this occurs, decreasing values
of 0 < r < 1 should be specified, such as INSTEP=1E-1, INSTEP=1E-2, INSTEP=1E-4,
and so on, until the iteration starts successfully.
330 ! Chapter 6: The NLP Procedure
v For trust region algorithms (TRUREG, DBLDOG, LEVMAR), the INSTEP= option
specifies a factor r > 0 for the initial radius Δ^(0) of the trust region. The default initial
trust region radius is the length of the scaled gradient. This step corresponds to the
default radius factor of r = 1.
v For line-search algorithms (NEWRAP, CONGRA, QUANEW, HYQUAN), the INSTEP=
option specifies an upper bound for the initial step length for the line search during the
first five iterations. The default initial step length is r = 1.
v For the Nelder-Mead simplex algorithm (NMSIMP), the INSTEP=r option defines the
size of the initial simplex.
For more details, see the section Computational Problems on page 382.
LCDEACT=r
LCD=r
specifies a threshold r for the Lagrange multiplier that decides whether an active inequality
constraint remains active or can be deactivated. For a maximization (minimization), an active
inequality constraint can be deactivated only if its Lagrange multiplier is greater (less) than the
threshold value r. For maximization, r must be greater than zero; for minimization, r must be
smaller than zero. The default value is

   r = ±min( 0.01, max( 0.1 ABSGCONV, 0.001 gmax^(k) ) )

where the + stands for maximization, the - for minimization, ABSGCONV is the value of the
absolute gradient criterion, and gmax^(k) is the maximum absolute element of the (projected)
gradient g^(k) or Z^T g^(k).
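The default threshold follows directly from this formula; a minimal Python sketch (the function name is assumed, and the example values below are illustrative, not SAS defaults):

```python
def lcdeact_default(absgconv, gmax, maximize):
    """Default LCDEACT= threshold:
    r = +/- min(0.01, max(0.1*ABSGCONV, 0.001*gmax)),
    positive for maximization, negative for minimization.
    """
    r = min(0.01, max(0.1 * absgconv, 0.001 * gmax))
    return r if maximize else -r
```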
LCEPSILON=r
LCEPS=r
LCE=r
specifies the range r > 0 for active and violated boundary and linear constraints. During the
optimization process, the introduction of rounding errors can force PROC NLP to increase the
value of r by a factor of 10, 100, .... If this happens, it is indicated by a message written to the
log. For more information, see the section Linear Complementarity (LICOMP) on page 366.
LCSINGULAR=r
LCSING=r
LCS=r
specifies a criterion r > 0 used in the update of the QR decomposition that decides whether
an active constraint is linearly dependent on a set of other active constraints. The default value
is r = 1E-8. The larger r becomes, the more the active constraints are recognized as being
linearly dependent. If the value of r is larger than 0.1, it is reset to 0.1.
LINESEARCH=i
LIS=i
specifies the line-search method for the CONGRA, QUANEW, HYQUAN, and NEWRAP
optimization techniques. Refer to Fletcher (1987) for an introduction to line-search techniques.
The value of i can be 1, ..., 8. For CONGRA, QUANEW, and NEWRAP, the default value is
i = 2. A special line-search method is the default for the least squares technique HYQUAN
that is based on an algorithm developed by Lindström and Wedin (1984). Although it needs
more memory, this default line-search method sometimes works better with large least squares
problems. However, by specifying LIS=i, i = 1, ..., 8, it is possible to use one of the standard
techniques with HYQUAN.
LIS=1 specifies a line-search method that needs the same number of function and
gradient calls for cubic interpolation and cubic extrapolation.
LIS=2 specifies a line-search method that needs more function than gradient calls
for quadratic and cubic interpolation and cubic extrapolation; this method
is implemented as shown in Fletcher (1987) and can be modified to an
exact line search by using the LSPRECISION= option.
LIS=3 specifies a line-search method that needs the same number of function and
gradient calls for cubic interpolation and cubic extrapolation; this method is
implemented as shown in Fletcher (1987) and can be modified to an exact
line search by using the LSPRECISION= option.
LIS=4 specifies a line-search method that needs the same number of function and
gradient calls for stepwise extrapolation and cubic interpolation.
LIS=5 specifies a line-search method that is a modified version of LIS=4.
LIS=6 specifies golden section line search (Polak 1971), which uses only function
values for linear approximation.
LIS=7 specifies bisection line search (Polak 1971), which uses only function
values for linear approximation.
LIS=8 specifies the Armijo line-search technique (Polak 1971), which uses only
function values for linear approximation.
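Of these, the function-values-only methods are straightforward to sketch. The following Python function is an illustrative golden-section search in the spirit of LIS=6, not the PROC NLP implementation; it assumes the step-length function phi is unimodal on the bracketing interval [a, b]:

```python
import math

def golden_section(phi, a, b, tol=1e-8):
    """Golden-section search for a minimizer of phi on [a, b].

    Uses only function values (as LIS=6 does); each iteration shrinks
    the bracket by the factor (sqrt(5)-1)/2 ~ 0.618.
    """
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                 # minimizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = phi(c)
        else:                       # minimizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)
```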
LIST
displays the model program and variable lists. The LIST option is a debugging feature and
is not normally needed. This output is not included in either the default output or the output
specified by the PALL option.
LISTCODE
displays the derivative tables and the compiled program code. The LISTCODE option is
a debugging feature and is not normally needed. This output is not included in either the
default output or the output specified by the PALL option. The option is similar to that used in
the MODEL procedure in SAS/ETS software.
LSPRECISION=r
LSP=r
specifies the degree of accuracy that should be obtained by the line-search algorithms LIS=2
and LIS=3. Usually an imprecise line search is inexpensive and sufficient for convergence to
the optimum. For difficult optimization problems, a more precise and expensive line search may
be necessary (Fletcher 1987). The second (default for NEWRAP, QUANEW, and CONGRA)
and third line-search methods approach exact line search for small LSPRECISION= values. In
the presence of numerical problems, it is advised to decrease the LSPRECISION= value to
obtain a more precise line search. The default values are as follows:
TECH= UPDATE= LSP default
QUANEW DBFGS, BFGS r = 0.4
QUANEW DDFP, DFP r = 0.06
HYQUAN DBFGS r = 0.1
HYQUAN DDFP r = 0.06
CONGRA all r = 0.1
NEWRAP no update r = 0.9
For more details, refer to Fletcher (1987).
MAXFUNC=i
MAXFU=i
specifies the maximum number i of function calls in the optimization process. The default
values are
v TRUREG, LEVMAR, NRRIDG, NEWRAP: 125
v QUANEW, HYQUAN, DBLDOG: 500
v CONGRA, QUADAS: 1000
v NMSIMP: 3000
Note that the optimization can be terminated only after completing a full iteration. Therefore,
the number of function calls that are actually performed can exceed the number that is specied
by the MAXFUNC= option.
MAXITER=i [n]
MAXIT=i [n]
specifies the maximum number i of iterations in the optimization process. The default values
are:
v TRUREG, LEVMAR, NRRIDG, NEWRAP: 50
v QUANEW, HYQUAN, DBLDOG: 200
v CONGRA, QUADAS: 400
v NMSIMP: 1000
This default value is valid also when i is specified as a missing value. The optional second
value n is valid only for TECH=QUANEW with nonlinear constraints. It specifies an upper
bound n for the number of iterations of an algorithm used to reduce the violation of nonlinear
constraints at a starting point. The default value is n = 20.
MAXSTEP=r [n]
specifies an upper bound for the step length of the line-search algorithms during the first
n iterations. By default, r is the largest double precision value and n is the largest integer
available. Setting this option can increase the speed of convergence for TECH=CONGRA,
TECH=QUANEW, TECH=HYQUAN, and TECH=NEWRAP.
MAXTIME=r
specifies an upper limit of r seconds of real time for the optimization process. The default value
is the largest floating-point double representation of the computer. Note that the time specified
by the MAXTIME= option is checked only once at the end of each iteration. Therefore,
the actual running time of the PROC NLP job may be longer than that specified by the
MAXTIME= option. The actual running time includes the rest of the time needed to finish the
iteration, time for the output of the (temporary) results, and (if required) the time for saving the
results in an OUTEST= data set. Using the MAXTIME= option with a permanent OUTEST=
data set enables you to separate large optimization problems into a series of smaller problems
that need smaller amounts of real time.
MINITER=i
MINIT=i
specifies the minimum number of iterations. The default value is zero. If more iterations
than are actually needed are requested for convergence to a stationary point, the optimization
algorithms can behave strangely. For example, the effect of rounding errors can prevent the
algorithm from continuing for the required number of iterations.
MODEL=model-name, model-list
MOD=model-name, model-list
MODFILE=model-name, model-list
reads the program statements from one or more input model files created by previous PROC
NLP steps using the OUTMODEL= option. If it is necessary to include the program code at a
special location in newly written code, the INCLUDE statement can be used instead of using
the MODEL= option. Using both the MODEL= option and the INCLUDE statement with the
same model file will include the same model twice, which can produce different results than
including it once. The MODEL= option is similar to the option used in PROC MODEL in
SAS/ETS software.
MSINGULAR=r
MSING=r
specifies a relative singularity criterion r > 0 for measuring singularity of the Hessian and
crossproduct Jacobian and their projected forms. The default value is 1E-12 if the SINGULAR=
option is not specified and max(10ε, 1E-4 × SINGULAR) otherwise, where ε is the relative
machine precision. For more information, see the section Covariance Matrix on page 385.
NOEIGNUM
suppresses the computation and output of the determinant and the inertia of the Hessian,
crossproduct Jacobian, and covariance matrices. The inertia of a symmetric matrix consists of
the numbers of negative, positive, and zero eigenvalues. For large applications, the NOEIGNUM
option can save computer time.
NOMISS
is valid only for those variables of the DATA= data set that are referred to in program statements.
If the NOMISS option is specied, observations with any missing value for those variables are
skipped. If the NOMISS option is not specied, the missing value may result in a missing value
of the objective function, implying that the corresponding BY group of data is not processed.
NOPRINT
NOP
suppresses the output.
OPTCHECK[=r]
computes the function values f(x_l) of a grid of points x_l in a small neighborhood of x*. The
x_l are located in a ball of radius r about x*. If the OPTCHECK option is specified without r,
the default value is r = 0.1 at the starting point and r = 0.01 at the terminating point. If a
point x_l* is found with a better function value than f(x*), then optimization is restarted at x_l*.
For more information on grids, see the section DECVAR Statement on page 343.
OUT=SAS-data-set
creates an output data set that contains those variables of a DATA= input data set referred
to in the program statements plus additional variables computed by performing the program
statements of the objective function, derivatives, and nonlinear constraints. The OUT= data set
can also contain first- and second-order derivatives of these variables if the OUTDER= option
is specified. The variables and derivatives are evaluated at x*; for TECH=NONE, they are
evaluated at x^(0).
OUTALL
If an OUTEST= data set is specified, this option sets the OUTHESSIAN option if the MIN
or MAX statement is used. If the LSQ statement is used, the OUTALL option sets the
OUTCRPJAC option. If nonlinear constraints are specified using the NLINCON statement,
the OUTALL option sets the OUTNLCJAC option.
OUTCRPJAC
If an OUTEST= data set is specified, the crossproduct Jacobian matrix of the m functions
composing the least squares function is written to the OUTEST= data set.
OUTDER= 0 | 1 | 2
specifies whether or not derivatives are written to the OUT= data set. For OUTDER=2, first-
and second-order derivatives are written to the data set; for OUTDER=1, only first-order
derivatives are written; for OUTDER=0, no derivatives are written to the data set. The default
value is OUTDER=0. Derivatives are evaluated at x*.
OUTEST=SAS-data-set
OUTVAR=SAS-data-set
creates an output data set that contains the results of the optimization. This is useful for reporting
and for restarting the optimization in a subsequent execution of the procedure. Information
in the data set can include parameter estimates, gradient values, constraint information,
Lagrangian values, Hessian values, Jacobian values, covariance, standard errors, and confidence
intervals.
OUTGRID
writes the grid points and their function values to the OUTEST= data set. By default, only the
feasible grid points are saved; however, if the INFEASIBLE option is specified, all feasible and
infeasible grid points are saved. Note that the BEST= option does not affect the output of grid
points to the OUTEST= data set. For more information on grids, see the section DECVAR
Statement on page 343.
OUTHESSIAN
OUTHES
writes the Hessian matrix of the objective function to the OUTEST= data set. If the Hessian
matrix is computed for some other reason (if, for example, the PHESSIAN option is specified),
the OUTHESSIAN option is set by default.
OUTITER
writes, at each iteration, the parameter estimates, the value of the objective function, the
gradient (if available), and (if the OUTTIME option is specified) the time in seconds since the
start of the optimization to the OUTEST= data set.
OUTJAC
writes the Jacobian matrix of the m functions composing the least squares function to the
OUTEST= data set. If the PJACOBI option is specified, the OUTJAC option is set by default.
OUTMODEL=model-name
OUTMOD=model-name
OUTM=model-name
specifies the name of an output model file to which the program statements are to be written.
The program statements of this file can be included into the program statements of a succeeding
PROC NLP run using the MODEL= option or the INCLUDE program statement. The
OUTMODEL= option is similar to the option used in PROC MODEL in SAS/ETS software.
Note that the following statements are not part of the program code that is written to an
OUTMODEL= data set: MIN, MAX, LSQ, MINQUAD, MAXQUAD, DECVAR, BOUNDS,
BY, CRPJAC, GRADIENT, HESSIAN, JACNLC, JACOBIAN, LABEL, LINCON, MATRIX,
and NLINCON.
OUTNLCJAC
If an OUTEST= data set is specified, the Jacobian matrix of the nonlinear constraint functions
specified by the NLINCON statement is written to the OUTEST= data set. If the Jacobian
matrix of the nonlinear constraint functions is computed for some other reason (if, for example,
the PNLCJAC option is specified), the OUTNLCJAC option is set by default.
OUTTIME
is used if an OUTEST= data set is specified and if the OUTITER option is specified. If
OUTTIME is specified, the time in seconds from the start of the optimization to the start of
each iteration is written to the OUTEST= data set.
PALL
ALL
displays all optional output except the output generated by the PSTDERR, PCOV, LIST, or
LISTCODE option.
PCOV
displays the covariance matrix specied by the COV= option. The PCOV option is set
automatically if the PALL and COV= options are set.
PCRPJAC
PJTJ
displays the n × n crossproduct Jacobian matrix J^T J. If the PALL option is specified and the
LSQ statement is used, this option is set automatically. If general linear constraints are active
at the solution, the projected crossproduct Jacobian matrix is also displayed.
PEIGVAL
displays the distribution of eigenvalues if a G4 inverse is computed for the covariance matrix.
The PEIGVAL option is useful for observing which eigenvalues of the matrix are recognized
as zero eigenvalues when the generalized inverse is computed, and it is the basis for setting the
COVSING= option in a subsequent execution of PROC NLP. For more information, see the
section Covariance Matrix on page 385.
PERROR
specifies additional output for applications where the program code for the objective function
or nonlinear constraints cannot be evaluated during the iteration process. The PERROR option
is set by default during the evaluations at the starting point but not during the optimization
process.
PFUNCTION
displays the values of all functions specified in an LSQ, MIN, or MAX statement for each
observation read from the DATA= input data set. The PALL option sets the PFUNCTION
option automatically.
PGRID
displays the function values from the grid search. For more information on grids, see the
section DECVAR Statement on page 343.
PHESSIAN
PHES
displays the n × n Hessian matrix G. If the PALL option is specified and the MIN or MAX
statement is used, this option is set automatically. If general linear constraints are active at the
solution, the projected Hessian matrix is also displayed.
PHISTORY
PHIS
displays the optimization history. No optimization history is displayed for TECH=LICOMP.
This output is included in both the default output and the output specied by the PALL option.
PINIT
PIN
displays the initial values and derivatives (if available). This output is included in both the
default output and the output specied by the PALL option.
PJACOBI
PJAC
displays the m × n Jacobian matrix J. Because of the memory requirement for large least
squares problems, this option is not invoked when using the PALL option.
PNLCJAC
displays the Jacobian matrix of nonlinear constraints specified by the NLINCON statement.
The PNLCJAC option is set automatically if the PALL option is specified.
PSHORT
SHORT
PSH
restricts the amount of default output. If PSHORT is specified, then
v The initial values are not displayed.
v The listing of constraints is not displayed.
v If there is more than one function in the MIN, MAX, or LSQ statement, their values are
not displayed.
v If the GRADCHECK option is used, only the test vector is displayed.
PSTDERR
STDERR
SE
computes standard errors that are defined as square roots of the diagonal elements of the
covariance matrix. The t values and probabilities > |t| are displayed together with the
approximate standard errors. The type of covariance matrix must be specified using the
COV= option. The SIGSQ= option, the VARDEF= option, and the special variables _NOBS_
and _DF_ defined in the program statements can be used to define a scalar factor σ² of the
covariance matrix and the approximate standard errors. For more information, see the section
Covariance Matrix on page 385.
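The square-root-of-diagonal step itself is simple; a Python sketch (the helper name is assumed, and the covariance matrix in the example is made up for illustration):

```python
import math

def approx_std_errors(cov):
    """Approximate standard errors: square roots of the diagonal
    elements of a covariance matrix given as a list of rows."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]
```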
PSUMMARY
SUMMARY
SUM
restricts the amount of default displayed output to a short form of iteration history and notes,
warnings, and errors.
PTIME
specifies the output of four different but partially overlapping differences of real time:

v total running time
v total time for the evaluation of objective function, nonlinear constraints, and derivatives:
shows the total time spent executing the programming statements specifying the objective
function, derivatives, and nonlinear constraints, and (if necessary) their first- and second-order
derivatives. This is the total time needed for code evaluation before, during, and
after iterating.
v total time for optimization: shows the total time spent iterating.
v time for some CMP parsing: shows the time needed for parsing the program statements
and their derivatives. In most applications this is a negligible number, but for applications
that contain ARRAY statements or DO loops or use an optimization technique with
analytic second-order derivatives, it can be considerable.
RANDOM=i
specifies a positive integer as a seed value for the pseudorandom number generator.
Pseudorandom numbers are used as the initial value x^(0).
RESTART=i
REST=i
specifies that the QUANEW, HYQUAN, or CONGRA algorithm is restarted with a steepest
descent/ascent search direction after at most i > 0 iterations. Default values are as follows:

v CONGRA with UPDATE=PB: restart is done automatically, so specification of i is not
used
v CONGRA with UPDATE≠PB: i = min(10n, 80), where n is the number of parameters
v QUANEW, HYQUAN: i is the largest integer available
SIGSQ=sq
specifies a scalar factor sq > 0 for computing the covariance matrix. If the SIGSQ= option
is specified, VARDEF=N is the default. For more information, see the section Covariance
Matrix on page 385.
SINGULAR=r
SING=r
specifies the singularity criterion r > 0 for the inversion of the Hessian matrix and crossproduct
Jacobian. The default value is 1E-8. For more information, refer to the MSINGULAR=
and VSINGULAR= options.
TECH=name
TECHNIQUE=name
specifies the optimization technique. Valid values for it are as follows:

v CONGRA
chooses one of four different conjugate gradient optimization algorithms, which can
be more precisely specified with the UPDATE= option and modified with the
LINESEARCH= option. When this option is selected, UPDATE=PB by default. For n ≥ 400,
CONGRA is the default optimization technique.
v DBLDOG
performs a version of double dogleg optimization, which can be more precisely specified
with the UPDATE= option. When this option is selected, UPDATE=DBFGS by default.
v HYQUAN
chooses one of three different hybrid quasi-Newton optimization algorithms which can
be more precisely defined with the VERSION= option and modified with the
LINESEARCH= option. By default, VERSION=2 and UPDATE=DBFGS.
v LEVMAR
performs the Levenberg-Marquardt minimization. For n < 40, this is the default
minimization technique for least squares problems.
v LICOMP
solves a quadratic program as a linear complementarity problem.
v NMSIMP
performs the Nelder-Mead simplex optimization method.
v NONE
does not perform any optimization. This option can be used
- to do grid search without optimization
- to compute and display derivatives and covariance matrices which cannot be obtained
efficiently with any of the optimization techniques
v NEWRAP
performs the Newton-Raphson optimization technique. The algorithm combines a line-
search algorithm with ridging. The line-search algorithm LINESEARCH=2 is the default.
v NRRIDG
performs the Newton-Raphson optimization technique. For n ≤ 40 and nonlinear least
squares, this is the default.
v QUADAS
performs a special quadratic version of the active set strategy.
v QUANEW
chooses one of four quasi-Newton optimization algorithms which can be defined more
precisely with the UPDATE= option and modified with the LINESEARCH= option. This
is the default for 40 < n < 400 or if there are nonlinear constraints.
v TRUREG
performs the trust region optimization technique.
UPDATE=method
UPD=method
specifies the update method for the (dual) quasi-Newton, double dogleg, hybrid quasi-Newton,
or conjugate gradient optimization technique. Not every update method can be used with each
optimizer. For more information, see the section Optimization Algorithms on page 362.
Valid values for method are as follows:
BFGS performs the original BFGS (Broyden, Fletcher, Goldfarb, & Shanno) update of
the inverse Hessian matrix.
DBFGS performs the dual BFGS (Broyden, Fletcher, Goldfarb, & Shanno) update of the
Cholesky factor of the Hessian matrix.
DDFP performs the dual DFP (Davidon, Fletcher, & Powell) update of the Cholesky
factor of the Hessian matrix.
DFP performs the original DFP (Davidon, Fletcher, & Powell) update of the inverse
Hessian matrix.
PB performs the automatic restart update method of Powell (1977) and Beale (1972).
FR performs the Fletcher-Reeves update (Fletcher 1987).
PR performs the Polak-Ribière update (Fletcher 1987).
CD performs a conjugate-descent update of Fletcher (1987).
VARDEF= DF | N
specifies the divisor d used in the calculation of the covariance matrix and approximate
standard errors. If the SIGSQ= option is not specified, the default value is VARDEF=DF;
otherwise, VARDEF=N is the default. For more information, see the section Covariance
Matrix on page 385.
VERSION= 1 | 2 | 3
VS= 1 | 2 | 3
specifies the version of the hybrid quasi-Newton optimization technique or the version of the
quasi-Newton optimization technique with nonlinear constraints.

For the hybrid quasi-Newton optimization technique,

VS=1 specifies version HY1 of Fletcher and Xu (1987).
VS=2 specifies version HY2 of Fletcher and Xu (1987).
VS=3 specifies version HY3 of Fletcher and Xu (1987).

For the quasi-Newton optimization technique with nonlinear constraints,

VS=1 specifies update of the μ vector like Powell (1978a, b) (update like VF02AD).
VS=2 specifies update of the μ vector like Powell (1982b) (update like VMCWD).

In both cases, the default value is VS=2.
VSINGULAR=r
VSING=r
specifies a relative singularity criterion r > 0 for measuring singularity of the Hessian and
crossproduct Jacobian and their projected forms, which may have to be converted to compute
the covariance matrix. The default value is 1E-8 if the SINGULAR= option is not specified
and the value of SINGULAR otherwise. For more information, see the section Covariance
Matrix on page 385.
XCONV=r [n]
XTOL=r [n]
specifies the relative parameter convergence criterion. For all techniques except NMSIMP,
termination requires a small relative parameter change in subsequent iterations:

   max_j |x_j^(k) - x_j^(k-1)| / max( |x_j^(k)|, |x_j^(k-1)|, XSIZE ) ≤ r

For the NMSIMP technique, the same formula is used, but x^(k) is defined as the vertex with the
lowest function value and x^(k-1) is defined as the vertex with the highest function value in the
simplex. The default value is r = 1E-8 for the NMSIMP technique and r = 0 otherwise. The
optional integer value n specifies the number of successive iterations for which the criterion
must be satisfied before the process can be terminated.
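The relative-change criterion can be checked directly; a minimal Python sketch (function name assumed; it also assumes each denominator max(|a|, |b|, XSIZE) is nonzero, which the XSIZE= option guarantees when set positive):

```python
def xconv_satisfied(x_new, x_old, r, xsize=0.0):
    """Relative parameter change criterion:
    max_j |x_new_j - x_old_j| / max(|x_new_j|, |x_old_j|, xsize) <= r
    """
    rel = max(
        abs(a - b) / max(abs(a), abs(b), xsize)
        for a, b in zip(x_new, x_old)
    )
    return rel <= r
```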
XSIZE=r
specifies the parameter r ≥ 0 of the relative parameter termination criterion. The default value
is r = 0. For more details, see the XCONV= option.
ARRAY Statement
ARRAY arrayname [ dimensions ] [$] [variables and constants] ;
The ARRAY statement is similar to, but not the same as, the ARRAY statement in the SAS DATA
step. The ARRAY statement is used to associate a name (of no more than eight characters) with a
list of variables and constants. The array name is used with subscripts in the program to refer to the
array elements. The following code illustrates this:
array r[8] r1-r8;
do i = 1 to 8;
r[i] = 0;
end;
The ARRAY statement does not support all the features of the DATA step ARRAY statement. It
cannot be used to give initial values to array elements. Implicit indexing of variables cannot be
used; all array references must have explicit subscript expressions. Only exact array dimensions are
allowed; lower-bound specifications are not supported and a maximum of six dimensions is allowed.
On the other hand, the ARRAY statement does allow both variables and constants to be used as
array elements. (Constant array elements cannot have values assigned to them.) Both the dimension
specification and the list of elements are optional, but at least one must be given. When the list
of elements is not given or fewer elements than the size of the array are listed, array variables are
created by suffixing element numbers to the array name to complete the element list.
BOUNDS Statement
BOUNDS b_con [ , b_con. . . ] ;
where b_con is given in one of the following formats:
v number operator parameter_list operator number
v number operator parameter_list
v parameter_list operator number
and operator is ≤, <, ≥, >, or =.
Boundary constraints are specified with a BOUNDS statement. One- or two-sided boundary
constraints are allowed. The boundary constraints in the list are separated by commas. For example,
bounds 0 <= a1-a9 X <= 1, -1 <= c2-c5;
bounds b1-b10 y >= 0;
More than one BOUNDS statement can be used. If more than one lower (upper) bound for the same
parameter is specified, the maximum (minimum) of these is taken. If the maximum l_j of all lower
bounds is larger than the minimum of all upper bounds u_j for the same variable x_j, the boundary
constraint is replaced by x_j := l_j := min(u_j), defined by the minimum of all upper bounds specified
for x_j.
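This merging rule can be sketched in Python; the function below is an illustration of the rule described above (the name and the handling of missing bounds are assumptions, not SAS itself):

```python
def merge_bounds(lowers, uppers):
    """Effective bounds for one parameter from repeated BOUNDS specs.

    Takes the maximum of all lower bounds and the minimum of all upper
    bounds; if they cross, the variable is fixed at the minimum of the
    upper bounds.
    """
    lo = max(lowers) if lowers else float("-inf")
    up = min(uppers) if uppers else float("inf")
    if lo > up:
        lo = up  # x_j is fixed at min(u_j)
    return lo, up
```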
BY Statement
BY variables ;
A BY statement can be used with PROC NLP to obtain separate analyses on DATA= data set
observations in groups dened by the BY variables. That means, for values of the TECH= option
other than NONE, an optimization problem is solved for each BY group separately. When a BY
statement appears, the procedure expects the input DATA= data set to be sorted in order of the BY
variables. If the input data set is not sorted in ascending order, it is necessary to use one of the
following alternatives:
v Use the SORT procedure with a similar BY statement to sort the data.
v Use the BY statement option NOTSORTED or DESCENDING in the BY statement for the
NLP procedure. As a cautionary note, the NOTSORTED option does not mean that the data
are unsorted but rather that the data are arranged in groups (according to values of the BY
variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
v Use the DATASETS procedure (in Base SAS software) to create an index on the BY variables.
For more information on the BY statement, refer to the discussion in SAS Language Reference:
Concepts. For more information on the DATASETS procedure, refer to the SAS Procedures Guide.
CRPJAC Statement
CRPJAC variables ;
The CRPJAC statement defines the crossproduct Jacobian matrix J^T J used in solving least squares
problems. For more information, see the section Derivatives on page 360. If the DIAHES option
is not specified, the CRPJAC statement lists n(n + 1)/2 variable names, which correspond to the
elements (J^T J)_{j,k}, j ≥ k, of the lower triangle of the symmetric crossproduct Jacobian matrix
listed by rows. For example, the statements
lsq f1-f3;
decvar x1-x3;
crpjac jj1-jj6;
correspond to the crossproduct Jacobian matrix

           [ JJ1  JJ2  JJ4 ]
   J^T J = [ JJ2  JJ3  JJ5 ]
           [ JJ4  JJ5  JJ6 ]
If the DIAHES option is specified, only the n diagonal elements must be listed in the CRPJAC
statement. The n rows and n columns of the crossproduct Jacobian matrix must be in the same
order as the n corresponding parameter names listed in the DECVAR statement. To specify the
values of nonzero derivatives, the variables specified in the CRPJAC statement have to be defined at
the left-hand side of algebraic expressions in programming statements. For example, consider the
Rosenbrock function:
proc nlp tech=levmar;
   lsq f1 f2;
   decvar x1 x2;
   gradient g1 g2;
   crpjac cpj1-cpj3;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
   g1 = -200 * x1 * (x2 - x1 * x1) - (1 - x1);
   g2 = 100 * (x2 - x1 * x1);
   cpj1 = 400 * x1 * x1 + 1;
   cpj2 = -200 * x1;
   cpj3 = 100;
run;
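The cpj1-cpj3 formulas above can be verified outside of SAS. The following Python sketch (an illustration only, not part of PROC NLP; the test point is an arbitrary choice) forms J^T J explicitly from the Jacobian of f1 and f2 and compares it with the hand-coded entries:

```python
# Check the cpj1-cpj3 entries against an explicit J^T J for
# f1 = 10*(x2 - x1*x1), f2 = 1 - x1 (Rosenbrock least-squares form).
def crossproduct_jacobian(x1):
    # Jacobian rows: df1/dx = (-20*x1, 10), df2/dx = (-1, 0)
    J = [[-20.0 * x1, 10.0], [-1.0, 0.0]]
    # (J^T J)[i][j] = sum_k J[k][i] * J[k][j]
    return [[sum(J[k][i] * J[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x1 = 1.5
jtj = crossproduct_jacobian(x1)
cpj1 = 400 * x1 * x1 + 1    # (J^T J)[1,1]
cpj2 = -200 * x1            # (J^T J)[2,1]
cpj3 = 100                  # (J^T J)[2,2]
assert abs(jtj[0][0] - cpj1) < 1e-9
assert abs(jtj[1][0] - cpj2) < 1e-9
assert abs(jtj[1][1] - cpj3) < 1e-9
```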
DECVAR Statement
DECVAR name_list [=numbers] [, name_list [=numbers] ...] ;
VAR name_list [=numbers] [, name_list [=numbers] ...] ;
PARMS name_list [=numbers] [, name_list [=numbers] ...] ;
PARAMETERS name_list [=numbers] [, name_list [=numbers] ...] ;
The DECVAR statement lists the names of the n > 0 decision variables and specifies grid search
and initial values for an iterative optimization process. The decision variables listed in the DECVAR
statement cannot also be used in the MIN, MAX, MINQUAD, MAXQUAD, LSQ, GRADIENT,
HESSIAN, JACOBIAN, CRPJAC, or NLINCON statement.
The DECVAR statement contains a list of decision variable names (not separated by commas)
optionally followed by an equals sign and a list of numbers. If the number list consists of only one
number, this number defines the initial value for all the decision variables listed to the left of the
equals sign.
If the number list consists of more than one number, these numbers specify the grid locations for
each of the decision variables listed left of the equals sign. The TO and BY keywords can be used
to specify a number list for a grid search. When a grid of points is specified with a DECVAR
statement, PROC NLP computes the objective function value at each grid point and chooses the best
(feasible) grid point as a starting point for the optimization process. The use of the BEST= option
is recommended to save computing time and memory for the storing and sorting of all grid point
information. Usually only feasible grid points are included in the grid search. If the specified grid
contains points located outside the feasible region and you are interested in the function values at
those points, it is possible to use the INFEASIBLE option to compute (and display) their function
values as well.
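The grid-search start-point selection described above is easy to picture outside of SAS. The Python sketch below (an illustration; the objective and grid values are arbitrary choices, not PROC NLP internals) evaluates an objective at every grid point and keeps the best point, which is the role a DECVAR grid plays:

```python
# Evaluate the objective at each grid point and keep the best one,
# mimicking what a DECVAR grid specification triggers in PROC NLP.
import itertools

def rosenbrock(x1, x2):
    f1 = 10 * (x2 - x1 * x1)
    f2 = 1 - x1
    return 0.5 * (f1 * f1 + f2 * f2)

grid = [-2.0, -1.0, 0.0, 1.0, 2.0]   # like: decvar x1 x2 = -2 to 2 by 1;
best = min(itertools.product(grid, grid), key=lambda p: rosenbrock(*p))
assert best == (1.0, 1.0)            # the minimizer happens to lie on this grid
```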
GRADIENT Statement
GRADIENT variables ;
The GRADIENT statement defines the gradient vector which contains the first-order derivatives of the
objective function f with respect to x_1, ..., x_n. For more information, see the section Derivatives
on page 360. To specify the values of nonzero derivatives, the variables specified in the GRADIENT
statement must be defined on the left-hand side of algebraic expressions in programming statements.
For example, consider the Rosenbrock function:
proc nlp tech=congra;
   min y;
   decvar x1 x2;
   gradient g1 g2;
   y1 = 10 * (x2 - x1 * x1);
   y2 = 1 - x1;
   y = .5 * (y1 * y1 + y2 * y2);
   g1 = -200 * x1 * (x2 - x1 * x1) - (1 - x1);
   g2 = 100 * (x2 - x1 * x1);
run;
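The analytic gradient coded above can be checked against central finite differences. The Python fragment below is a sketch for that purpose only (the test point and tolerance are arbitrary choices, not part of the SAS example):

```python
# Finite-difference check that the analytic gradient matches the
# objective y = .5*(y1**2 + y2**2) from the example.
def obj(x1, x2):
    y1 = 10 * (x2 - x1 * x1)
    y2 = 1 - x1
    return 0.5 * (y1 * y1 + y2 * y2)

def grad(x1, x2):
    g1 = -200 * x1 * (x2 - x1 * x1) - (1 - x1)
    g2 = 100 * (x2 - x1 * x1)
    return g1, g2

x1, x2, h = 0.5, 2.0, 1e-6
g1, g2 = grad(x1, x2)
fd1 = (obj(x1 + h, x2) - obj(x1 - h, x2)) / (2 * h)
fd2 = (obj(x1, x2 + h) - obj(x1, x2 - h)) / (2 * h)
assert abs(g1 - fd1) < 1e-3 and abs(g2 - fd2) < 1e-3
```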
HESSIAN Statement
HESSIAN variables ;
The HESSIAN statement defines the Hessian matrix G containing the second-order derivatives of the
objective function f with respect to x_1, ..., x_n. For more information, see the section Derivatives
on page 360.
If the DIAHES option is not specified, the HESSIAN statement lists n(n+1)/2 variable names
which correspond to the elements G_{j,k}, j >= k, of the lower triangle of the symmetric Hessian
matrix listed by rows. For example, the statements
min f;
decvar x1 - x3;
hessian g1-g6;
correspond to the Hessian matrix

       [ G1  G2  G4 ]   [ d^2f/dx1^2     d^2f/dx1 dx2   d^2f/dx1 dx3 ]
   G = [ G2  G3  G5 ] = [ d^2f/dx2 dx1   d^2f/dx2^2     d^2f/dx2 dx3 ]
       [ G4  G5  G6 ]   [ d^2f/dx3 dx1   d^2f/dx3 dx2   d^2f/dx3^2   ]
If the DIAHES option is specified, only the n diagonal elements must be listed in the HESSIAN
statement. The n rows and n columns of the Hessian matrix G must correspond to the order of the
n parameter names listed in the DECVAR statement. To specify the values of nonzero derivatives,
the variables specified in the HESSIAN statement must be defined on the left-hand side of algebraic
expressions in the programming statements. For example, consider the Rosenbrock function:
proc nlp tech=nrridg;
   min f;
   decvar x1 x2;
   gradient g1 g2;
   hessian h1-h3;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
   f = .5 * (f1 * f1 + f2 * f2);
   g1 = -200 * x1 * (x2 - x1 * x1) - (1 - x1);
   g2 = 100 * (x2 - x1 * x1);
   h1 = -200 * (x2 - 3 * x1 * x1) + 1;
   h2 = -200 * x1;
   h3 = 100;
run;
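The Hessian entries h1-h3 can likewise be checked against finite differences of the analytic gradient. This Python sketch is an illustration only (the test point is an arbitrary choice):

```python
# Finite-difference check of the analytic Hessian in the example
# (h1 = d^2f/dx1^2, h2 = d^2f/dx1 dx2, h3 = d^2f/dx2^2).
def g(x1, x2):
    g1 = -200 * x1 * (x2 - x1 * x1) - (1 - x1)
    g2 = 100 * (x2 - x1 * x1)
    return g1, g2

x1, x2, h = 0.3, 1.2, 1e-6
h1 = -200 * (x2 - 3 * x1 * x1) + 1
h2 = -200 * x1
h3 = 100.0
fd_h1 = (g(x1 + h, x2)[0] - g(x1 - h, x2)[0]) / (2 * h)
fd_h2 = (g(x1, x2 + h)[0] - g(x1, x2 - h)[0]) / (2 * h)
fd_h3 = (g(x1, x2 + h)[1] - g(x1, x2 - h)[1]) / (2 * h)
assert abs(h1 - fd_h1) < 1e-3
assert abs(h2 - fd_h2) < 1e-3
assert abs(h3 - fd_h3) < 1e-3
```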
INCLUDE Statement
INCLUDE model files ;

The INCLUDE statement can be used to append model code to the current model code. The contents
of included model files, created using the OUTMODEL= option, are inserted into the model program
at the position in which the INCLUDE statement appears.
JACNLC Statement
JACNLC variables ;
The JACNLC statement defines the Jacobian matrix for the system of constraint functions
c_1(x), ..., c_mc(x). The statement lists the mc * n variable names which correspond to the
elements CJ_{i,j}, i = 1,...,mc; j = 1,...,n, of the Jacobian matrix listed by rows.
For example, the statements
nlincon c1-c3;
decvar x1-x2;
jacnlc cj1-cj6;
correspond to the Jacobian matrix

        [ CJ1  CJ2 ]   [ dc1/dx1  dc1/dx2 ]
   CJ = [ CJ3  CJ4 ] = [ dc2/dx1  dc2/dx2 ]
        [ CJ5  CJ6 ]   [ dc3/dx1  dc3/dx2 ]
The mc rows of the Jacobian matrix must be in the same order as the mc corresponding names of
nonlinear constraints listed in the NLINCON statement. The n columns of the Jacobian matrix must
be in the same order as the n corresponding parameter names listed in the DECVAR statement. To
specify the values of nonzero derivatives, the variables specified in the JACNLC statement must be
defined on the left-hand side of algebraic expressions in programming statements.
For example,
array cd[3,4] cd1-cd12;
nlincon c1-c3 >= 0;
jacnlc cd1-cd12;
c1 = 8 - x1 * x1 - x2 * x2 - x3 * x3 - x4 * x4 -
     x1 + x2 - x3 + x4;
c2 = 10 - x1 * x1 - 2 * x2 * x2 - x3 * x3 - 2 * x4 * x4 +
     x1 + x4;
c3 = 5 - 2 * x1 * x1 - x2 * x2 - x3 * x3 - 2 * x1 + x2 + x4;
cd[1,1]= -1 - 2 * x1;  cd[1,2]= 1 - 2 * x2;
cd[1,3]= -1 - 2 * x3;  cd[1,4]= 1 - 2 * x4;
cd[2,1]= 1 - 2 * x1;   cd[2,2]= -4 * x2;
cd[2,3]= -2 * x3;      cd[2,4]= 1 - 4 * x4;
cd[3,1]= -2 - 4 * x1;  cd[3,2]= 1 - 2 * x2;
cd[3,3]= -2 * x3;      cd[3,4]= 1;
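A hand-coded constraint Jacobian like cd[i,j] above is easy to get wrong, so a finite-difference spot check is useful. The Python sketch below (illustration only; the test point is an arbitrary choice) verifies the second row, cd[2,1]-cd[2,4], against central differences of c2:

```python
# Verify one row of the hand-coded constraint Jacobian (cd[2,*])
# against central differences of c2 at a test point.
def c2(x):
    x1, x2, x3, x4 = x
    return 10 - x1*x1 - 2*x2*x2 - x3*x3 - 2*x4*x4 + x1 + x4

x = [0.5, -1.0, 2.0, 0.25]
analytic = [1 - 2*x[0], -4*x[1], -2*x[2], 1 - 4*x[3]]   # cd[2,1]..cd[2,4]
h = 1e-6
for j in range(4):
    xp = list(x); xm = list(x)
    xp[j] += h; xm[j] -= h
    fd = (c2(xp) - c2(xm)) / (2 * h)
    assert abs(fd - analytic[j]) < 1e-3
```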
JACOBIAN Statement
JACOBIAN variables ;
The JACOBIAN statement defines the Jacobian matrix J for a system of objective functions. For
more information, see the section Derivatives on page 360.
The JACOBIAN statement lists m * n variable names that correspond to the elements J_{i,j},
i = 1,...,m; j = 1,...,n, of the Jacobian matrix listed by rows.
For example, the statements
lsq f1-f3;
decvar x1 x2;
jacobian j1-j6;
correspond to the Jacobian matrix

       [ J1  J2 ]   [ df1/dx1  df1/dx2 ]
   J = [ J3  J4 ] = [ df2/dx1  df2/dx2 ]
       [ J5  J6 ]   [ df3/dx1  df3/dx2 ]
The m rows of the Jacobian matrix must correspond to the order of the m function names listed in
the MIN, MAX, or LSQ statement. The n columns of the Jacobian matrix must correspond to the
order of the n decision variables listed in the DECVAR statement. To specify the values of nonzero
derivatives, the variables specified in the JACOBIAN statement must be defined on the left-hand side
of algebraic expressions in programming statements.
For example, consider the Rosenbrock function:
proc nlp tech=levmar;
   array j[2,2] j1-j4;
   lsq f1 f2;
   decvar x1 x2;
   jacobian j1-j4;
   f1 = 10 * (x2 - x1 * x1);
   f2 = 1 - x1;
   j[1,1] = -20 * x1;
   j[1,2] = 10;
   j[2,1] = -1;
   j[2,2] = 0;   /* is not needed */
run;
The JACOBIAN statement is useful only if more than one objective function is given in the MIN,
MAX, or LSQ statement, or if a DATA= input data set specifies more than one function. If the
MIN, MAX, or LSQ statement contains only one objective function and no DATA= input data set
is used, the JACOBIAN and GRADIENT statements are equivalent. In the case of least squares
minimization, the crossproduct Jacobian is used as an approximate Hessian matrix.
LABEL Statement
LABEL variable=label [ ,variable=label. . . ] ;
The LABEL statement can be used to assign labels (up to 40 characters) to the decision variables
listed in the DECVAR statement. The INEST= data set can also be used to assign labels. The labels
are attached to the output and are used in an OUTEST= data set.
LINCON Statement
LINCON l_con [ , l_con . . . ] ;
where l_con is given in one of the following formats:

v linear_term operator number

v number operator linear_term

and linear_term is of the following form:

   <+|-> <number *> variable <+|- <number *> variable ...>

The value of operator can be one of the following: <=, <, >=, >, or =.

The LINCON statement specifies equality or inequality constraints

   sum_{j=1}^{n} a_{ij} x_j  { <= | = | >= }  b_i    for i = 1,...,m

separated by commas. For example, the constraint 4x1 - 3x2 = 0 is expressed as

decvar x1 x2;
lincon 4 * x1 - 3 * x2 = 0;

and the constraints

   10x1 - x2 >= 10
   x1 + 5x2 >= 15

are expressed as

decvar x1 x2;
lincon 10 <= 10 * x1 - x2,
       x1 + 5 * x2 >= 15;
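The two inequality constraints in the last example can be sketched as a simple feasibility test. This Python fragment is an illustration only (the sample points are arbitrary choices):

```python
# Feasibility check for the two linear constraints in the example:
# 10*x1 - x2 >= 10 and x1 + 5*x2 >= 15.
def feasible(x1, x2):
    return 10 * x1 - x2 >= 10 and x1 + 5 * x2 >= 15

assert feasible(2.0, 3.0)       # 20 - 3 = 17 >= 10 and 2 + 15 = 17 >= 15
assert not feasible(1.0, 1.0)   # 10 - 1 = 9 < 10
```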
MATRIX Statement
MATRIX M_name pattern_definitions ;

The MATRIX statement defines a matrix H and the vector g, which can be given in the MINQUAD
or MAXQUAD statement. The matrix H and vector g are initialized to zero, so that only the nonzero
elements are given. The five different forms of the MATRIX statement are illustrated with the
following example:
       [ 100  10   1   0 ]        [ 1 ]
   H = [  10 100  10   1 ]    g = [ 2 ]    c = 0
       [   1  10 100  10 ]        [ 3 ]
       [   0   1  10 100 ]        [ 4 ]
Each MATRIX statement first names the matrix or vector and then lists its elements. If more than
one MATRIX statement is given for the same matrix, the later definitions override the earlier ones.
The rows and columns in matrix H and vector g correspond to the order of decision variables in the
DECVAR statement.

v Full Matrix Definition: The MATRIX statement consists of H_name or g_name followed by
an equals sign and all (nonredundant) numerical values of the matrix H or vector g. Assuming
symmetry, only the elements of the lower triangular part of the matrix H must be listed. This
specification should be used mainly for small problems with almost dense H matrices.
MATRIX H= 100
10 100
1 10 100
0 1 10 100;
MATRIX G= 1 2 3 4;
v Band-diagonal Matrix Definition: This form of pattern definition is useful if the H matrix
has (almost) constant band-diagonal structure. The MATRIX statement consists of H_name
followed by empty brackets [,], an equals sign, and a list of numbers to be assigned to the
diagonal and successive subdiagonals.
MATRIX H[,]= 100 10 1;
MATRIX G= 1 2 3 4;
v Sparse Matrix Definitions: In each of the following three specification types, the H_name
or g_name is followed by a list of pattern definitions separated by commas. Each pattern
definition consists of a location specification in brackets on the left side of an equals sign that
is followed by a list of numbers.

(Sub)Diagonalwise: This form of pattern definition is useful if the H matrix contains
nonzero elements along diagonals or subdiagonals. The starting location is specified by
an index pair in brackets [i,j]. The expression k * num on the right-hand side specifies
that num is assigned to the elements [i,j], ..., [i+k-1,j+k-1] in a diagonal
direction of the H matrix. The special case k = 1 can be used to assign values to single
nonzero element locations in H.
MATRIX H [1,1]= 4 * 100,
         [2,1]= 3 * 10,
         [3,1]= 2 * 1;
MATRIX G [1,1]= 1 2 3 4;
Columnwise Starting in Diagonal: This form of pattern definition is useful if the H
matrix contains nonzero elements columnwise starting in the diagonal. The starting
location is specified by only one index j in brackets [,j]. The k numbers at the right-hand
side are assigned to the elements [j,j], ..., [min(j+k-1,n),j].
MATRIX H [,1]= 100 10 1,
[,2]= 100 10 1,
[,3]= 100 10,
[,4]= 100;
MATRIX G [,1]= 1 2 3 4;
Rowwise Starting in First Column: This form of pattern definition is useful if the H
matrix contains nonzero elements rowwise ending in the diagonal. The starting location
is specified by only one index i in brackets [i,]. The k numbers at the right-hand side
are assigned to the elements [i,1], ..., [i,min(k,i)].
MATRIX H [1,]= 100,
[2,]= 10 100,
[3,]= 1 10 100,
[4,]= 0 1 10 100;
MATRIX G [1,]= 1 2 3 4;
MIN, MAX, and LSQ Statements
MIN variables ;
MAX variables ;
LSQ variables ;
The MIN, MAX, or LSQ statement specifies the objective functions. Only one of the three statements
can be used at a time and at least one must be given. The MIN and LSQ statements are for minimizing
the objective function, and the MAX statement is for maximizing the objective function. The MIN,
MAX, or LSQ statement lists one or more variables naming the objective functions f_i, i = 1,...,m
(later defined by SAS program code).

v If the MIN or MAX statement lists m function names f_1, ..., f_m, the objective function is

   f(x) = sum_{i=1}^{m} f_i(x)

v If the LSQ statement lists m function names f_1, ..., f_m, the objective function is

   f(x) = (1/2) sum_{i=1}^{m} f_i^2(x)
Note that the LSQ statement can be used only if TECH=LEVMAR or TECH=HYQUAN.
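The relationship between an LSQ specification and the equivalent explicit objective can be sketched in a few lines of Python (an illustration only; the residual pair is the Rosenbrock example used throughout this chapter):

```python
# The LSQ objective 0.5 * sum(f_i^2) for the Rosenbrock residual pair,
# matching the explicit MIN-statement form used in earlier examples.
def lsq_obj(x1, x2):
    fs = [10 * (x2 - x1 * x1), 1 - x1]
    return 0.5 * sum(f * f for f in fs)

assert lsq_obj(1.0, 1.0) == 0.0   # both residuals vanish at the solution
assert lsq_obj(0.0, 0.0) == 0.5   # f1 = 0, f2 = 1
```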
MINQUAD and MAXQUAD Statements
MINQUAD H_name [ , g_name [ , c_number ] ] ;
MAXQUAD H_name [ , g_name [ , c_number ] ] ;
The MINQUAD and MAXQUAD statements specify the matrix H, vector g, and scalar c that define
a quadratic objective function. The MINQUAD statement is for minimizing the objective function
and the MAXQUAD statement is for maximizing the objective function.
The rows and columns in H and g correspond to the order of decision variables given in the
DECVAR statement. Specifying the objective function with a MINQUAD or MAXQUAD statement
indirectly defines the analytic derivatives for the objective function. Therefore, statements specifying
derivatives are not valid in these cases. Also, only use these statements when TECH=LICOMP or
TECH=QUADAS and no nonlinear constraints are imposed.
There are three ways of using the MINQUAD or MAXQUAD statement:
v Using ARRAY Statements:
The names H_name and g_name specified in the MINQUAD or MAXQUAD statement can be
used in ARRAY statements. This specification is mainly for small problems with almost dense
H matrices.
proc nlp pall;
   array h[2,2] .4 0
                 0 4;
   minquad h, -100;
   decvar x1 x2 = -1;
   bounds 2 <= x1 <= 50,
        -50 <= x2 <= 50;
   lincon 10 <= 10 * x1 - x2;
run;
v Using Elementwise Setting:
The names H_name and g_name specified in the MINQUAD or MAXQUAD statement can
be followed directly by one-dimensional indices specifying the corresponding elements of
the matrix H and vector g. These element names can be used on the left side of numerical
assignments. The one-dimensional index value l following H_name, which corresponds to
the element H_{i,j}, is computed by l = (i-1)n + j, i >= j. The matrix H and vector g are
initialized to zero, so that only the nonzero elements must be given. This specification is
efficient for small problems with sparse H matrices.
proc nlp pall;
   minquad h, -100;
   decvar x1 x2;
   bounds 2 <= x1 <= 50,
        -50 <= x2 <= 50;
   lincon 10 <= 10 * x1 - x2;
   h1 = .4; h4 = 4;
run;
v Using MATRIX Statements:
The names H_name and g_name specified in the MINQUAD or MAXQUAD statement can be
used in MATRIX statements. There are different ways to specify the nonzero elements of the
matrix H and vector g by MATRIX statements. The following example illustrates one way to
use the MATRIX statement.
proc nlp all;
   matrix h[1,1] = .4 4;
   minquad h, -100;
   decvar x1 x2 = -1;
   bounds 2 <= x1 <= 50,
        -50 <= x2 <= 50;
   lincon 10 <= 10 * x1 - x2;
run;
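The quadratic objective that MINQUAD and MAXQUAD describe can be sketched generically in Python. This is an illustration only, assuming the common convention f(x) = 0.5 x'Hx + g'x + c (the H, g, and c values below are arbitrary test data, not taken from the SAS examples):

```python
# Generic dense evaluation of a quadratic objective of the form
# f(x) = 0.5 * x'Hx + g'x + c (assumed convention for this sketch).
def quad_obj(H, g, c, x):
    n = len(x)
    qx = sum(0.5 * H[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return qx + sum(g[i] * x[i] for i in range(n)) + c

H = [[0.4, 0.0], [0.0, 4.0]]
g = [0.0, 0.0]
assert quad_obj(H, g, -100.0, [2.0, 0.0]) == 0.5 * 0.4 * 4 - 100.0
```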
NLINCON Statement
NLINCON nlcon [ , nlcon ...] [ / option ] ;
NLC nlcon [ , nlcon ...] [ / option ] ;
where nlcon is given in one of the following formats:

v number operator variable_list operator number

v number operator variable_list

v variable_list operator number

and operator is <=, <, >=, >, or =. The value of option can be SUMOBS or EVERYOBS.
General nonlinear equality and inequality constraints are specified with the NLINCON statement.
The syntax of the NLINCON statement is similar to that of the BOUNDS statement with two small
additions:

v The BOUNDS statement can contain only the names of decision variables. The NLINCON
statement can also contain the names of continuous functions of the decision variables. These
functions must be computed in the program statements, and since they can depend on the
values of some of the variables in the DATA= data set, there are two possibilities:

If the continuous functions should be summed across all observations read from the
DATA= data set, the NLINCON statement must be terminated by the / SUMOBS option.

If the continuous functions should be evaluated separately for each observation in the
data set, the NLINCON statement must be terminated by the / EVERYOBS option. One
constraint is generated for each observation in the data set.
v If the continuous function should be evaluated only once for the entire data set, the NLINCON
statement has the same form as the BOUNDS statement. If this constraint does depend on the
values of variables in the DATA= data set, it is evaluated using the data of the first observation.

One- or two-sided constraints can be specified in the NLINCON statement. However, equality
constraints must be one-sided. The pairs of operators (<,<=) and (>,>=) are treated in the same
way.
These three statements require the values of the three functions v1, v2, v3 to be between zero and ten,
and they are equivalent:
nlincon 0 <= v1-v3,
v1-v3 <= 10;
nlincon 0 <= v1-v3 <= 10;
nlincon 10 >= v1-v3 >= 0;
Also, consider the Rosen-Suzuki problem. It has three nonlinear inequality constraints:

   8 - x1^2 - x2^2 - x3^2 - x4^2 - x1 + x2 - x3 + x4 >= 0
   10 - x1^2 - 2*x2^2 - x3^2 - 2*x4^2 + x1 + x4 >= 0
   5 - 2*x1^2 - x2^2 - x3^2 - 2*x1 + x2 + x4 >= 0

These are specified as
nlincon c1-c3 >= 0;
c1 = 8 - x1 * x1 - x2 * x2 - x3 * x3 - x4 * x4 -
     x1 + x2 - x3 + x4;
c2 = 10 - x1 * x1 - 2 * x2 * x2 - x3 * x3 - 2 * x4 * x4 +
     x1 + x4;
c3 = 5 - 2 * x1 * x1 - x2 * x2 - x3 * x3 - 2 * x1 + x2 + x4;
NOTE: QUANEW and NMSIMP are the only optimization subroutines that support the NLINCON
statement.
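The three Rosen-Suzuki constraints can be evaluated directly to see which ones are active at a candidate point. The Python sketch below is an illustration; it uses the widely reported solution (0, 1, 2, -1) of this classical test problem:

```python
# Evaluate the three Rosen-Suzuki constraints at a point; at the
# commonly reported optimum (0, 1, 2, -1) the first and third are active.
def constraints(x1, x2, x3, x4):
    c1 = 8 - x1*x1 - x2*x2 - x3*x3 - x4*x4 - x1 + x2 - x3 + x4
    c2 = 10 - x1*x1 - 2*x2*x2 - x3*x3 - 2*x4*x4 + x1 + x4
    c3 = 5 - 2*x1*x1 - x2*x2 - x3*x3 - 2*x1 + x2 + x4
    return c1, c2, c3

c1, c2, c3 = constraints(0.0, 1.0, 2.0, -1.0)
assert c1 == 0.0 and c3 == 0.0   # active constraints
assert c2 == 1.0                 # strictly satisfied
```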
PROFILE Statement
PROFILE parms [ / [ ALPHA= values ] [ options ] ] ;
where parms is given in the format pnam_1 pnam_2 ... pnam_n, and values is the list of alpha values
in (0,1).
The PROFILE statement
v writes the (x, y) coordinates of profile points for each of the listed parameters to the OUTEST=
data set

v displays, or writes to the OUTEST= data set, the profile likelihood confidence limits (PL CLs)
for the listed parameters for the specified alpha values. If the approximate standard errors are
available, the corresponding Wald confidence limits can be computed.

When computing the profile points or likelihood profile confidence intervals, PROC NLP assumes
that a maximization of the log likelihood function is desired. Each point of the profile and each
endpoint of the confidence interval is computed by solving corresponding nonlinear optimization
problems.

The keyword PROFILE must be followed by the names of parameters for which the profile or the
PL CLs should be computed. If the parameter name list is empty, the profiles and PL CLs for all
parameters are computed. Then, optionally, the alpha values follow. The list of alpha values may
contain TO and BY keywords. Each element alpha must satisfy 0 < alpha < 1. The following is an
example:
profile l11-l15 u1-u5 c /
alpha= .9 to .1 by -.1 .09 to .01 by -.01;
Duplicate alpha values or values outside (0,1) are automatically eliminated from the list.

A number of additional options can be specified.

FFACTOR=r specifies the factor relating the discrepancy function f(theta) to the chi-square
quantile. The default value is r = 2.

FORCHI= F | CHI defines the scale for the y values written to the OUTEST= data set. For
FORCHI=F, the y values are scaled to the values of the log likelihood function
y = f(theta); for FORCHI=CHI, the y values are scaled so that y = chi^2. The
default value is FORCHI=F.
FEASRATIO=r specifies a factor of the Wald confidence limit (or an approximation of it if
standard errors are not computed) defining an upper bound for the search for
confidence limits. In general, the range of x values in the profile graph is
between r = 1 and r = 2 times the length of the corresponding Wald interval.
For many examples, the chi-square quantiles corresponding to small alpha values
define a y level y_bar = f_hat + (1/2) chi^2_1(1 - alpha) that is too far away
from f_hat to be reached by y(x) for x within the range of twice the Wald
confidence limit. The search for an intersection with such a y level at a
practically infinite value of x can be computationally expensive. A smaller
value for r can speed up computation time by restricting the search for
confidence limits to a region closer to x_hat. The default value of r = 1000
practically disables the FEASRATIO= option.
OUTTABLE specifies that the complete set theta of parameter estimates rather than only
x = theta_j for each confidence limit is written to the OUTEST= data set. This
output can be helpful for further analyses on how small changes in x = theta_j
affect the changes in the theta_i, i != j.
For some applications, it may be computationally less expensive to compute the PL confidence
limits for a few parameters than to compute the approximate covariance matrix of many parameters,
which is the basis for the Wald confidence limits. However, the computation of the profile of the
discrepancy function and the corresponding CLs in general will be much more time-consuming than
that of the Wald CLs.
Program Statements
This section lists the program statements used to code the objective function and nonlinear constraints
and their derivatives, and it documents the differences between program statements in the NLP
procedure and program statements in the DATA step. The syntax of program statements used in
PROC NLP is identical to that used in the CALIS, GENMOD, and MODEL procedures (refer to the
SAS/ETS User's Guide).
Most of the program statements that can be used in the SAS DATA step can also be used in the
NLP procedure. See the SAS Language Guide or base SAS documentation for a description of the
SAS program statements.
ABORT;
CALL name [ ( expression [, expression . . . ] ) ];
DELETE;
DO [ variable = expression
[ TO expression ] [ BY expression ]
[, expression [ TO expression ] [ BY expression ] . . . ]
]
[ WHILE expression ] [ UNTIL expression ];
END;
GOTO statement_label;
IF expression;
IF expression THEN program_statement;
ELSE program_statement;
variable = expression;
variable + expression;
LINK statement_label;
PUT [ variable] [=] [...];
RETURN;
SELECT [( expression )];
STOP;
SUBSTR( variable, index, length ) = expression;
WHEN ( expression) program_statement;
OTHERWISE program_statement;
For the most part, the SAS program statements work as they do in the SAS DATA step as documented
in the SAS Language Guide. However, there are several differences that should be noted.
v The ABORT statement does not allow any arguments.
v The DO statement does not allow a character index variable. Thus

do i = 1,2,3;

is supported; however,

do i = 'A','B','C';

is not.
v The PUT statement, used mostly for program debugging in PROC NLP, supports only some
of the features of the DATA step PUT statement, and has some new features that the DATA
step PUT statement does not:
The PROC NLP PUT statement does not support line pointers, factored lists, iteration
factors, overprinting, _INFILE_, the colon (:) format modifier, or $.
The PROC NLP PUT statement does support expressions, but the expression must be
enclosed inside of parentheses. For example, the following statement displays the square
root of x: put (sqrt(x));
The PROC NLP PUT statement supports the print item _PDV_ to print a formatted
listing of all variables in the program. For example, the following statement displays a
more readable listing of the variables than the _all_ print item: put _pdv_;
v The WHEN and OTHERWISE statements allow more than one target statement. That is,
DO/END groups are not necessary for multiple statement WHENs. For example, the following
syntax is valid:
SELECT;
WHEN ( exp1 ) stmt1;
stmt2;
WHEN ( exp2 ) stmt3;
stmt4;
END;
It is recommended to keep some kind of order in the input of NLP, that is, between the statements that
define decision variables and constraints and the program code used to specify objective functions
and derivatives.
Use of Special Variables in Program Code
Except for the quadratic programming techniques (QUADAS and LICOMP) that do not execute
program statements during the iteration process, several special variables in the program code can be
used to communicate with PROC NLP in special situations:
v _OBS_ If a DATA= input data set is used, it is possible to access a variable _OBS_ which
contains the number of the observation processed from the data set. You should not change
the content of the _OBS_ variable. This variable enables you to modify the programming
statements depending on the observation number processed in the DATA= input data set. For
example, to set variable A to 1 when observation 10 is processed, and otherwise to 2, it is
possible to specify
IF _OBS_ = 10 THEN A=1; ELSE A=2;
v _ITER_ This variable is set by PROC NLP, and it contains the number of the current iteration
of the optimization technique as it is displayed in the optimization history. You should not
change the content of the _ITER_ variable. It is possible to read the value of this variable in
order to modify the programming statements depending on the iteration number processed.
For example, to display the content of the variables A, B, and C when there are more than 100
iterations processed, it is possible to use
IF _ITER_ > 100 THEN PUT A B C;
v _DPROC_ This variable is set by PROC NLP to indicate whether the code is called only
to obtain the values of the m objective functions f_i (_DPROC_=0) or whether specified
derivatives (defined by the GRADIENT, JACOBIAN, CRPJAC, or HESSIAN statement) also
have to be computed (_DPROC_=1). You should not change the content of the _DPROC_
variable. Checking the _DPROC_ variable makes it possible to save computer time by not
performing derivative code that is not needed by the current call. In particular, when a DATA=
input data set is used, the code is processed many times to compute only the function values.
If the programming statements in the program contain the specification of computationally
expensive first- and second-order derivatives, you can put the derivative code in an IF statement
that is processed only if _DPROC_ is not zero.
v _INDF_ The _INDF_ variable is set by PROC NLP to inform you of the source of calls to the
function or derivative programming.
_INDF_=0 indicates the first function call in a grid search. This is also the first call evaluating
the programming statements if there is a grid search defined by grid values in the
DECVAR statement.
_INDF_=1 indicates further function calls in a grid search.
_INDF_=2 indicates the call for the feasible starting point. This is also the first call evaluating
the programming statements if there is no grid search defined.
_INDF_=3 indicates calls from a gradient-checking algorithm.
_INDF_=4 indicates calls from the minimization algorithm. The _ITER_ variable contains
the iteration number.
_INDF_=5 If the active set algorithm leaves the feasible region (due to rounding errors), an
algorithm tries to return it into the feasible region; _INDF_=5 indicates a call that is done
when such a step is successful.
_INDF_=6 indicates calls from a factorial test subroutine that tests the neighborhood of a
point x for optimality.
_INDF_=7, 8 indicates calls from subroutines needed to compute finite-difference derivatives
using only values of the objective function. No nonlinear constraints are evaluated.
_INDF_=9 indicates calls from subroutines needed to compute second-order finite-difference
derivatives using analytic (specified) first-order derivatives. No nonlinear constraints are
evaluated.
_INDF_=10 indicates calls where only the nonlinear constraints but no objective function are
needed. The analytic derivatives of the nonlinear constraints are computed.
_INDF_=11 indicates calls where only the nonlinear constraints but no objective function are
needed. The analytic derivatives of the nonlinear constraints are not computed.
_INDF_=-1 indicates the last call at the final solution.
You should not change the content of the _INDF_ variable.
v _LIST_ You can set the _LIST_ variable to control the output during the iteration process:
_LIST_=0 is equivalent to the NOPRINT option. It suppresses all output.
_LIST_=1 is equivalent to the PSUMMARY but not the PHISTORY option. The optimiza-
tion start and termination messages are displayed. However, the PSUMMARY option
suppresses the output of the iteration history.
_LIST_=2 is equivalent to the PSHORT option or to a combination of the PSUMMARY
and PHISTORY options. The optimization start information, the iteration history, and
termination message are displayed.
_LIST_=3 is equivalent to not PSUMMARY, not PSHORT, and not PALL. The optimization
start information, the iteration history, and the termination message are displayed.
_LIST_=4 is equivalent to the PALL option. The extended optimization start information
(also containing settings of termination criteria and other control parameters) is displayed.
_LIST_=5 In addition to the iteration history, the vector x^(k) of parameter estimates is
displayed for each iteration k.
_LIST_=6 In addition to the iteration history, the vector x^(k) of parameter estimates and the
gradient g^(k) (if available) of the objective function are displayed for each iteration k.
It is possible to set the _LIST_ variable in the program code to obtain more or less output in
each iteration of the optimization process. For example,
IF _ITER_ = 11 THEN _LIST_=5;
ELSE IF _ITER_ > 11 THEN _LIST_=1;
ELSE _LIST_=3;
v _TOOBIG_ The value of _TOOBIG_ is initialized to 0 by PROC NLP, but you can set it to 1
during the iteration, indicating problems evaluating the program statements. The objective
function and derivatives must be computable at the starting point. However, during the iteration
it is possible to set the _TOOBIG_ variable to 1, indicating that the programming statements
(computing the value of the objective function or the specified derivatives) cannot be performed
for the current value of x^(k). Some of the optimization techniques check the value of _TOOBIG_
and try to modify the parameter estimates so that the objective function (or derivatives) can be
computed in a following trial.
v _NOBS_ The value of the _NOBS_ variable is initialized by PROC NLP to the product of the
number of functions mfun specied in the MIN, MAX or LSQ statement and the number of
valid observations nobs in the current BY group of the DATA= input data set. The value of
the _NOBS_ variable is used for computing the scalar factor of the covariance matrix (see the
COV=, VARDEF=, and SIGSQ= options). If you reset the value of the _NOBS_ variable, the
value that is available at the end of the iteration is used by PROC NLP to compute the scalar
factor of the covariance matrix.
- _DF_ The value of the _DF_ variable is initialized by PROC NLP to the number n of
parameters specified in the DECVAR statement. The value of the _DF_ variable is used
for computing the scalar factor of the covariance matrix (see the COV=, VARDEF=, and
SIGSQ= options). If you reset the value of the _DF_ variable, the value that is available at
the end of the iteration is used by PROC NLP to compute the scalar factor of the covariance
matrix.
- _LASTF_ In each iteration (except the first one), the value of the _LASTF_ variable is set
by PROC NLP to the final value of the objective function that was achieved during the last
iteration. This value should agree with the value that is displayed in the iteration history and
that is written in the OUTEST= data set when the OUTITER option is specified.
Details: NLP Procedure
Criteria for Optimality
PROC NLP solves

    \min_{x \in \mathbb{R}^n} f(x)
    \text{subject to}\quad c_i(x) = 0, \quad i = 1, \dots, m_e
                    \quad c_i(x) \ge 0, \quad i = m_e + 1, \dots, m

where f is the objective function and the c_i's are the constraint functions.
A point x is feasible if it satisfies all the constraints. The feasible region G is the set of all the
feasible points. A feasible point x^* is a global solution of the preceding problem if no point in G
has a smaller function value than f(x^*). A feasible point x^* is a local solution of the problem if
there exists some open neighborhood surrounding x^* in which no point has a smaller function value
than f(x^*). Nonlinear programming algorithms cannot consistently find global minima. All the
algorithms in PROC NLP find a local minimum for this problem. If you need to check whether
the obtained solution is a global minimum, you may have to run PROC NLP with different starting
points obtained either at random or by selecting a point on a grid that contains G.
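The multistart idea can be sketched numerically. The following fragment (an illustration in Python, not PROC NLP syntax; the quartic objective, step size, and iteration count are assumptions of the example) runs a plain gradient descent from two different starting points and reaches two different local minima:

```python
# Illustration (not PROC NLP syntax): gradient descent on
# f(x) = x^4 - 3x^2 + x, which has two local minima.
# Different starting points converge to different local solutions,
# which is why a multistart strategy is needed to assess globality.

def grad(x):
    # f'(x) = 4x^3 - 6x + 1
    return 4 * x**3 - 6 * x + 1

def local_minimize(x, step=0.01, iters=5000):
    for _ in range(iters):
        x -= step * grad(x)
    return x

left = local_minimize(-2.0)   # converges to the minimizer near x = -1.30
right = local_minimize(2.0)   # converges to the minimizer near x = 1.13
```

Only the start at -2 reaches the global minimum near x = -1.30; the start at 2 is trapped at the shallower local minimum near x = 1.13.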
Every local minimizer x^* of this problem satisfies the following local optimality conditions:

- The gradient (vector of first derivatives) g(x^*) = \nabla f(x^*) of the objective function
(projected toward the feasible region if the problem is constrained) at the point x^* is zero.
- The Hessian (matrix of second derivatives) G(x^*) = \nabla^2 f(x^*) of the objective function
(projected toward the feasible region G in the constrained case) at the point x^* is positive
definite.
Most of the optimization algorithms in PROC NLP use iterative techniques that result in a sequence
of points x^{(0)}, \dots, x^{(k)}, \dots, that converges to a local solution x^*. At the solution, PROC NLP performs
tests to confirm that the (projected) gradient is close to zero and that the (projected) Hessian matrix
is positive definite.
Karush-Kuhn-Tucker Conditions
An important tool in the analysis and design of algorithms in constrained optimization is the
Lagrangian function, a linear combination of the objective function and the constraints:

    L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i c_i(x)

The coefficients \lambda_i are called Lagrange multipliers. This tool makes it possible to state necessary and
sufficient conditions for a local minimum. The various algorithms in PROC NLP create sequences of
points, each of which is closer than the previous one to satisfying these conditions.
Assuming that the functions f and c_i are twice continuously differentiable, the point x^* is a local
minimum of the nonlinear programming problem if there exists a vector \lambda^* = (\lambda_1^*, \dots, \lambda_m^*) that
meets the following conditions.

1. First-order Karush-Kuhn-Tucker conditions:

    c_i(x^*) = 0, \quad i = 1, \dots, m_e

    c_i(x^*) \ge 0, \quad \lambda_i^* \ge 0, \quad \lambda_i^* c_i(x^*) = 0, \quad i = m_e + 1, \dots, m

    \nabla_x L(x^*, \lambda^*) = 0
2. Second-order conditions: Each nonzero vector y \in \mathbb{R}^n that satisfies

    y^T \nabla_x c_i(x^*) = 0, \quad i \in \{1, \dots, m_e\} \cup \{ i \in \{m_e + 1, \dots, m\} : \lambda_i^* > 0 \}

also satisfies

    y^T \nabla_x^2 L(x^*, \lambda^*) \, y > 0
Most of the algorithms to solve this problem attempt to find a combination of vectors x and \lambda for
which the gradient of the Lagrangian function with respect to x is zero.
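As a concrete check of these conditions, the following sketch (Python, illustrative only; the toy problem min x1^2 + x2^2 subject to x1 + x2 = 1 and its known solution are assumptions of the example) verifies feasibility and stationarity of the Lagrangian at x^* = (0.5, 0.5) with multiplier 1:

```python
# Illustration (not PROC NLP syntax): checking the KKT conditions at the
# known solution of  min x1^2 + x2^2  subject to  x1 + x2 - 1 = 0,
# where x* = (0.5, 0.5) and the Lagrange multiplier is lambda* = 1.

x = (0.5, 0.5)
lam = 1.0

grad_f = (2 * x[0], 2 * x[1])     # gradient of the objective
grad_c = (1.0, 1.0)               # gradient of the equality constraint
c_val = x[0] + x[1] - 1.0         # constraint residual

# Feasibility, and stationarity of L(x, lambda) = f(x) - lambda * c(x)
feasible = abs(c_val) < 1e-12
stationary = all(abs(gf - lam * gc) < 1e-12 for gf, gc in zip(grad_f, grad_c))
```

Both flags come out true at the solution, which is exactly what the algorithms drive toward iteratively.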
Derivatives
The first- and second-order conditions of optimality are based on first and second derivatives of the
objective function f and the constraints c_i.

The gradient vector contains the first derivatives of the objective function f with respect to the
parameters x_1, \dots, x_n, as follows:

    g(x) = \nabla f(x) = \left( \frac{\partial f}{\partial x_j} \right)
The n \times n symmetric Hessian matrix contains the second derivatives of the objective function f with
respect to the parameters x_1, \dots, x_n, as follows:

    G(x) = \nabla^2 f(x) = \left( \frac{\partial^2 f}{\partial x_j \, \partial x_k} \right)
For least squares problems, the m \times n Jacobian matrix contains the first-order derivatives of the m
objective functions f_i(x) with respect to the parameters x_1, \dots, x_n, as follows:

    J(x) = (\nabla f_1, \dots, \nabla f_m) = \left( \frac{\partial f_i}{\partial x_j} \right)
In the case of least squares problems, the crossproduct Jacobian

    J^T J = \left( \sum_{i=1}^{m} \frac{\partial f_i}{\partial x_j} \frac{\partial f_i}{\partial x_k} \right)

is used as an approximate Hessian matrix. It is a very good approximation of the Hessian if the
residuals at the solution are small. (If the residuals are not sufficiently small at the solution,
this approach may result in slow convergence.) The fact that it is possible to obtain Hessian
approximations for this problem that do not require any computation of second derivatives means
that least squares algorithms are more efficient than unconstrained optimization algorithms. Using
the vector f(x) = (f_1(x), \dots, f_m(x))^T of function values, PROC NLP computes the gradient g(x)
by

    g(x) = J^T(x) f(x)
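The identity g(x) = J^T(x) f(x) can be verified numerically. In the sketch below (Python, illustrative only; the two residual functions are an assumed toy example), the objective is F(x) = (1/2) times the sum of squared residuals, and the J^T f product is compared with a central finite-difference gradient:

```python
# Illustration (not PROC NLP syntax): for the least squares objective
# F(x) = (1/2) * sum_i f_i(x)^2, the gradient equals J(x)^T f(x).
# The residuals f_1 = x1 - 1 and f_2 = 2*x2 - 3 are an assumed toy example.

def residuals(x):
    return [x[0] - 1.0, 2.0 * x[1] - 3.0]

def objective(x):
    return 0.5 * sum(r * r for r in residuals(x))

def jacobian(x):
    return [[1.0, 0.0], [0.0, 2.0]]  # entries d f_i / d x_j

x = [2.0, 0.5]
f = residuals(x)
J = jacobian(x)

# g = J^T f
g = [sum(J[i][j] * f[i] for i in range(2)) for j in range(2)]

# central finite-difference gradient of F for comparison
h = 1e-6
fd = []
for j in range(2):
    xp = list(x); xp[j] += h
    xm = list(x); xm[j] -= h
    fd.append((objective(xp) - objective(xm)) / (2 * h))
```

The two gradients agree to finite-difference accuracy; no second derivatives were needed.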
The mc \times n Jacobian matrix contains the first-order derivatives of the mc nonlinear constraint
functions c_i(x), i = 1, \dots, mc, with respect to the parameters x_1, \dots, x_n, as follows:

    CJ(x) = (\nabla c_1, \dots, \nabla c_{mc}) = \left( \frac{\partial c_i}{\partial x_j} \right)
PROC NLP provides three ways to compute derivatives:

- It computes analytical first- and second-order derivatives of the objective function f with
respect to the n variables x_j.
- It computes first- and second-order finite-difference approximations to the derivatives. For
more information, see the section Finite-Difference Approximations of Derivatives on
page 373.
- The user supplies formulas for analytical or numerical first- and second-order derivatives of
the objective function in the GRADIENT, JACOBIAN, CRPJAC, and HESSIAN statements.
The JACNLC statement can be used to specify the derivatives for the nonlinear constraints.
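A central finite-difference scheme of the kind mentioned in the second item can be sketched as follows (Python, illustrative only; the test function is an assumption of the example):

```python
# Illustration (not PROC NLP syntax): a central-difference approximation
# of the gradient, the kind of fallback used when analytic derivatives
# are not supplied.

def central_diff_gradient(func, x, h=1e-6):
    grad = []
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

# Example: f(x) = x1^2 + 3*x1*x2, exact gradient (2*x1 + 3*x2, 3*x1)
f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]
g = central_diff_gradient(f, [1.0, 2.0])
```

At the point (1, 2) the exact gradient is (8, 3), and the approximation matches it to roughly the truncation error of the scheme.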
Optimization Algorithms
There are three groups of optimization techniques available in PROC NLP. A particular optimizer
can be selected with the TECH= option in the PROC NLP statement.
Table 6.2 Optimization Techniques
Algorithm TECH=
Linear Complementarity Problem LICOMP
Quadratic Active Set Technique QUADAS
Trust-Region Method TRUREG
Newton-Raphson Method with Line Search NEWRAP
Newton-Raphson Method with Ridging NRRIDG
Quasi-Newton Methods (DBFGS, DDFP, BFGS, DFP) QUANEW
Double Dogleg Method (DBFGS, DDFP) DBLDOG
Conjugate Gradient Methods (PB, FR, PR, CD) CONGRA
Nelder-Mead Simplex Method NMSIMP
Levenberg-Marquardt Method LEVMAR
Hybrid Quasi-Newton Methods (DBFGS, DDFP) HYQUAN
Since no single optimization technique is invariably superior to others, PROC NLP provides a variety
of optimization techniques that work well in various circumstances. However, it is possible to devise
problems for which none of the techniques in PROC NLP can find the correct solution. Moreover,
nonlinear optimization can be computationally expensive in terms of time and memory, so care must
be taken when matching an algorithm to a problem.
All optimization techniques in PROC NLP use O(n^2) memory except the conjugate gradient methods,
which use only O(n) memory and are designed to optimize problems with many variables. Since the
techniques are iterative, they require the repeated computation of
- the function value (optimization criterion)
- the gradient vector (first-order partial derivatives)
- for some techniques, the (approximate) Hessian matrix (second-order partial derivatives)
- values of linear and nonlinear constraints
- the first-order partial derivatives (Jacobian) of nonlinear constraints
However, since each of the optimizers requires different derivatives and supports different types of
constraints, some computational efficiencies can be gained. The following table shows, for each
optimization technique, which derivatives are needed (FOD: first-order derivatives; SOD: second-
order derivatives) and what kinds of constraints (BC: boundary constraints; LIC: linear constraints;
NLC: nonlinear constraints) are supported.
Algorithm FOD SOD BC LIC NLC
LICOMP - - x x -
QUADAS - - x x -
TRUREG x x x x -
NEWRAP x x x x -
NRRIDG x x x x -
QUANEW x - x x x
DBLDOG x - x x -
CONGRA x - x x -
NMSIMP - - x x x
LEVMAR x - x x -
HYQUAN x - x x -
Preparation for Using Optimization Algorithms
It is rare that a problem is submitted to an optimization algorithm as is. Making a few changes
in your problem can reduce its complexity, increasing the chance of convergence and saving
execution time.
- Whenever possible, use linear functions instead of nonlinear functions. PROC NLP will
reward you with faster and more accurate solutions.
- Most optimization algorithms are based on quadratic approximations to nonlinear functions.
You should try to avoid the use of functions that cannot be properly approximated by quadratic
functions. Try to avoid the use of rational functions.
For example, the constraint

    \frac{\sin(x)}{x + 1} \ge 0

should be replaced by the equivalent constraint

    \sin(x)(x + 1) \ge 0

and the constraint

    \frac{\sin(x)}{x + 1} = 1

should be replaced by the equivalent constraint

    \sin(x) - (x + 1) = 0
- Try to avoid the use of exponential functions, if possible.
- If you can reduce the complexity of your function by the addition of a small number of
variables, it may help the algorithm avoid stationary points.
- Provide the best starting point you can. A good starting point leads to better quadratic
approximations and faster convergence.
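The equivalence of a rational constraint and its polynomial reformulation can be spot-checked numerically. The sketch below (Python, illustrative only) compares the two forms of the sin(x)/(x+1) >= 0 constraint at a few sample points away from x = -1, where multiplying through by (x+1)^2 >= 0 leaves the sign condition unchanged:

```python
# Illustration (not PROC NLP syntax): the rational constraint
# sin(x)/(x+1) >= 0 and its polynomial reformulation sin(x)*(x+1) >= 0
# accept the same points (for x != -1).

import math

def rational_ok(x):
    return math.sin(x) / (x + 1.0) >= 0.0

def reformulated_ok(x):
    return math.sin(x) * (x + 1.0) >= 0.0

samples = [-2.5, -0.5, 0.5, 2.0, 3.5, 6.0]
agree = all(rational_ok(x) == reformulated_ok(x) for x in samples)
```

The reformulated constraint avoids the singularity at x = -1, which is what makes it friendlier to quadratic-approximation-based optimizers.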
Choosing an Optimization Algorithm
The factors that go into choosing a particular optimizer for a particular problem are complex and may
involve trial and error. Several things must be taken into account. First, the structure of the problem
has to be considered: Is it quadratic? least squares? Does it have linear or nonlinear constraints?
Next, it is important to consider the type of derivatives of the objective function and the constraints
that are needed and whether these are analytically tractable or not. This section provides some
guidelines for making the right choices.
For many optimization problems, computing the gradient takes more computer time than computing
the function value, and computing the Hessian sometimes takes much more computer time and
memory than computing the gradient, especially when there are many decision variables. Optimization
techniques that do not use the Hessian usually require more iterations than techniques that do
use Hessian approximations (such as finite differences or the BFGS update) and so are often slower.
Techniques that use no derivatives at all tend to be slow and less reliable.
The derivative compiler is not efficient in the computation of second-order derivatives. For large
problems, memory and computer time can be saved by programming your own derivatives using the
GRADIENT, JACOBIAN, CRPJAC, HESSIAN, and JACNLC statements. If you are not able to
specify first- and second-order derivatives of the objective function, you can rely on finite-difference
gradients and Hessian update formulas. This combination is frequently used and works very well
for small and medium problems. For large problems, you are advised not to use an optimization
technique that requires the computation of second derivatives.
The following provides some guidance for matching an algorithm to a particular problem.

- Quadratic Programming
  - QUADAS
  - LICOMP
- General Nonlinear Optimization
  - Nonlinear Constraints
    + Small Problems: NMSIMP
      Not suitable for highly nonlinear problems or for problems with n > 20.
    + Medium Problems: QUANEW
  - Only Linear Constraints
    + Small Problems: TRUREG (NEWRAP, NRRIDG)
      (n <= 40) where the Hessian matrix is not expensive to compute. Sometimes
      NRRIDG can be faster than TRUREG, but TRUREG can be more stable. NRRIDG
      needs only one matrix with n(n+1)/2 double words; TRUREG and NEWRAP
      need two such matrices.
    + Medium Problems: QUANEW (DBLDOG)
      (n <= 200) where the objective function and the gradient are much faster to eval-
      uate than the Hessian. QUANEW and DBLDOG in general need more iterations
      than TRUREG, NRRIDG, and NEWRAP, but each iteration can be much faster.
      QUANEW and DBLDOG need only the gradient to update an approximate Hessian.
      QUANEW and DBLDOG need slightly less memory than TRUREG or NEWRAP
      (essentially one matrix with n(n+1)/2 double words).
    + Large Problems: CONGRA
      (n > 200) where the objective function and the gradient can be computed much
      faster than the Hessian and where too much memory is needed to store the (ap-
      proximate) Hessian. CONGRA in general needs more iterations than QUANEW
      or DBLDOG, but each iteration can be much faster. Since CONGRA needs only
      a factor of n double-word memory, many large applications of PROC NLP can be
      solved only by CONGRA.
    + No Derivatives: NMSIMP
      (n <= 20) where derivatives are not continuous or are very difficult to compute.
- Least Squares Minimization
  - Small Problems: LEVMAR (HYQUAN)
    (n <= 60) where the crossproduct Jacobian matrix is inexpensive to compute. In general,
    LEVMAR is more reliable, but there are problems with high residuals where HYQUAN
    can be faster than LEVMAR.
  - Medium Problems: QUANEW (DBLDOG)
    (n <= 200) where the objective function and the gradient are much faster to evaluate than
    the crossproduct Jacobian. QUANEW and DBLDOG in general need more iterations
    than LEVMAR or HYQUAN, but each iteration can be much faster.
  - Large Problems: CONGRA
  - No Derivatives: NMSIMP
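The guidelines above can be condensed into a small decision helper. The sketch below (Python, illustrative only, not part of PROC NLP; the function name and the thresholds simply transcribe the rules of thumb above) suggests a TECH= value to try first from rough problem traits:

```python
# A sketch encoding the selection guidelines above as a lookup:
# given rough problem traits, suggest a TECH= value to try first.

def suggest_tech(n, least_squares=False, nonlinear_constraints=False,
                 derivatives_available=True):
    if not derivatives_available:
        return "NMSIMP"                       # no-derivative fallback
    if nonlinear_constraints:
        return "NMSIMP" if n <= 20 else "QUANEW"
    if least_squares:
        if n <= 60:
            return "LEVMAR"                   # cheap crossproduct Jacobian
        return "QUANEW" if n <= 200 else "CONGRA"
    if n <= 40:
        return "TRUREG"                       # Hessian affordable
    return "QUANEW" if n <= 200 else "CONGRA"
```

This is only a starting point; as the text notes, matching an algorithm to a problem can still involve trial and error.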
Quadratic Programming Method
The QUADAS and LICOMP algorithms can be used to minimize or maximize a quadratic objective
function,

    f(x) = \frac{1}{2} x^T G x + g^T x + c, \quad \text{with } G^T = G

subject to linear or boundary constraints

    A x \ge b \quad \text{or} \quad l_j \le x_j \le u_j

where x = (x_1, \dots, x_n)^T, g = (g_1, \dots, g_n)^T, G is an n \times n symmetric matrix, A is an m \times n
matrix of general linear constraints, and b = (b_1, \dots, b_m)^T. The value of c modifies only the value
of the objective function, not its derivatives, and the location of the optimizer x^* does not depend on
the value of the constant term c. For QUADAS or LICOMP, the objective function must be specified
using the MINQUAD or MAXQUAD statement or using an INQUAD= data set.
In this case, derivatives do not need to be specified because the gradient vector

    \nabla f(x) = G x + g

and the n \times n Hessian matrix

    \nabla^2 f(x) = G

are easily obtained from the data input.
Simple boundary and general linear constraints can be specified using the BOUNDS or LINCON
statement or an INQUAD= or INEST= data set.
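Because the gradient of a quadratic objective is simply Gx + g, no differentiation is needed once G and g are read from the data. A small numeric sketch (Python, illustrative only; the values of G, g, and c are assumptions of the example):

```python
# Illustration (not PROC NLP syntax): for the quadratic objective
# f(x) = (1/2) x^T G x + g^T x + c, the gradient G x + g falls
# straight out of the data, with no numerical differentiation.

G = [[2.0, 0.0], [0.0, 4.0]]   # symmetric Hessian
g = [-2.0, -4.0]
c = 1.0

def f(x):
    quad = 0.5 * sum(x[i] * G[i][j] * x[j] for i in range(2) for j in range(2))
    return quad + sum(g[j] * x[j] for j in range(2)) + c

x = [3.0, 1.0]
grad = [sum(G[j][i] * x[i] for i in range(2)) + g[j] for j in range(2)]
```

At the minimizer, where Gx = -g (here x = (1, 1)), this gradient vanishes, and the constant c shifts only the objective value.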
General Quadratic Programming (QUADAS)
The QUADAS algorithm is an active set method that iteratively updates the QT decomposition of
the matrix A_k of active linear constraints and the Cholesky factor of the projected Hessian Z_k^T G Z_k
simultaneously. The update of active boundary and linear constraints is done separately; refer to
Gill et al. (1984). Here Q is an n_free \times n_free orthogonal matrix composed of vectors spanning the
null space Z of A_k in its first n_free - n_alc columns and the range space Y in its last n_alc columns; T is
an n_alc \times n_alc triangular matrix of special form, t_{ij} = 0 for i < n - j, where n_free is the number
of free parameters (n minus the number of active boundary constraints), and n_alc is the number of
active linear constraints. The Cholesky factor of the projected Hessian matrix Z_k^T G Z_k and the QT
decomposition are updated simultaneously when the active set changes.
Linear Complementarity (LICOMP)
The LICOMP technique solves a quadratic problem as a linear complementarity problem. It can
be used only if G is positive (negative) semidefinite for minimization (maximization) and if the
parameters are restricted to be positive.

This technique finds a point that meets the Karush-Kuhn-Tucker conditions by solving the linear
complementarity problem

    w = M z + q

with constraints

    w^T z = 0, \quad w \ge 0, \quad z \ge 0

where

    z = \begin{pmatrix} x \\ \lambda \end{pmatrix} \quad
    M = \begin{pmatrix} G & -A^T \\ A & 0 \end{pmatrix} \quad
    q = \begin{pmatrix} g \\ -b \end{pmatrix}
Only the LCEPSILON= option can be used to specify a tolerance used in computations.
General Nonlinear Optimization
Trust-Region Optimization (TRUREG)
The trust region method uses the gradient g(x^{(k)}) and the Hessian matrix G(x^{(k)}) and thus requires that
the objective function f(x) have continuous first- and second-order derivatives inside the feasible
region.

The trust region method iteratively optimizes a quadratic approximation to the nonlinear objective
function within a hyperelliptic trust region with radius \Delta that constrains the step length corresponding
to the quality of the quadratic approximation. The trust region method is implemented using the
techniques of Dennis, Gay, and Welsch (1981) and Gay (1983).
The trust region method performs well for small to medium problems and does not require many
function, gradient, and Hessian calls. If the computation of the Hessian matrix is computationally
expensive, use the UPDATE= option for update formulas (that gradually build the second-order
information in the Hessian). For larger problems, the conjugate gradient algorithm may be more
appropriate.
Newton-Raphson Optimization with Line Search (NEWRAP)
The NEWRAP technique uses the gradient g(x^{(k)}) and the Hessian matrix G(x^{(k)}) and thus requires
that the objective function have continuous first- and second-order derivatives inside the feasible
region. If second-order derivatives are computed efficiently and precisely, the NEWRAP method
may perform well for medium to large problems, and it does not need many function, gradient, and
Hessian calls.
This algorithm uses a pure Newton step when the Hessian is positive definite and when the Newton
step reduces the value of the objective function successfully. Otherwise, a combination of ridging
and line search is done to compute successful steps. If the Hessian is not positive definite, a multiple
of the identity matrix is added to the Hessian matrix to make it positive definite (Eskow and Schnabel
1991).
In each iteration, a line search is done along the search direction to find an approximate optimum
of the objective function. The default line-search method uses quadratic interpolation and cubic
extrapolation (LIS=2).
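The ridging step described above can be sketched as follows (Python, illustrative only; the indefinite example matrix and the starting ridge value are assumptions of the example). A multiple tau of the identity is increased until a Cholesky factorization of H + tau*I succeeds:

```python
# Illustration (not PROC NLP syntax): ridging an indefinite Hessian by
# adding a multiple of the identity until a Cholesky factorization
# succeeds, in the spirit of the modification described above.

def try_cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0.0:
                    return None          # not positive definite
                L[i][j] = d ** 0.5
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def ridge_until_pd(H, tau=1e-3):
    while try_cholesky([[H[i][j] + (tau if i == j else 0.0)
                         for j in range(len(H))]
                        for i in range(len(H))]) is None:
        tau *= 10.0
    return tau

H = [[0.0, 2.0], [2.0, 0.0]]   # indefinite: eigenvalues +2 and -2
tau = ridge_until_pd(H)
```

The ridge found here exceeds the magnitude of the most negative eigenvalue, which is exactly what makes the shifted matrix positive definite.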
Newton-Raphson Ridge Optimization (NRRIDG)
The NRRIDG technique uses the gradient g(x^{(k)}) and the Hessian matrix G(x^{(k)}) and thus requires that
the objective function have continuous first- and second-order derivatives inside the feasible region.

This algorithm uses a pure Newton step when the Hessian is positive definite and when the Newton
step reduces the value of the objective function successfully. If at least one of these two conditions
is not satisfied, a multiple of the identity matrix is added to the Hessian matrix. If this algorithm is
used for least squares problems, it performs a ridged Gauss-Newton minimization.

The NRRIDG method performs well for small to medium problems and does not need many function,
gradient, and Hessian calls. However, if the computation of the Hessian matrix is computationally
expensive, one of the (dual) quasi-Newton or conjugate gradient algorithms may be more efficient.
Since NRRIDG uses an orthogonal decomposition of the approximate Hessian, each iteration of
NRRIDG can be slower than that of NEWRAP, which works with Cholesky decomposition. However,
usually NRRIDG needs fewer iterations than NEWRAP.
Quasi-Newton Optimization (QUANEW)
The (dual) quasi-Newton method uses the gradient g(x^{(k)}) and does not need to compute second-
order derivatives since they are approximated. It works well for medium to moderately large
optimization problems where the objective function and the gradient are much faster to compute than
the Hessian, but in general it requires more iterations than the techniques TRUREG, NEWRAP, and
NRRIDG, which compute second-order derivatives.
The QUANEW algorithm depends on whether or not there are nonlinear constraints.
Unconstrained or Linearly Constrained Problems If there are no nonlinear constraints,
QUANEW is either
- the original quasi-Newton algorithm that updates an approximation of the inverse Hessian, or
- the dual quasi-Newton algorithm that updates the Cholesky factor of an approximate Hessian
(default),
depending on the value of the UPDATE= option. For problems with general linear inequality
constraints, the dual quasi-Newton methods can be more efficient than the original ones.
Four update formulas can be specified with the UPDATE= option:
DBFGS performs the dual BFGS (Broyden, Fletcher, Goldfarb, & Shanno) update of the
Cholesky factor of the Hessian matrix. This is the default.
DDFP performs the dual DFP (Davidon, Fletcher, & Powell) update of the Cholesky
factor of the Hessian matrix.
BFGS performs the original BFGS (Broyden, Fletcher, Goldfarb, & Shanno) update of
the inverse Hessian matrix.
DFP performs the original DFP (Davidon, Fletcher, & Powell) update of the inverse
Hessian matrix.
In each iteration, a line search is done along the search direction to find an approximate optimum.
The default line-search method uses quadratic interpolation and cubic extrapolation to obtain a step
length satisfying the Goldstein conditions. One of the Goldstein conditions can be violated if the
feasible region defines an upper limit of the step length. Violating the left-side Goldstein condition
can affect the positive definiteness of the quasi-Newton update. In those cases, either the update
is skipped or the iterations are restarted with an identity matrix, resulting in the steepest descent or
ascent search direction. Line-search algorithms other than the default one can be specified with the
LINESEARCH= option.
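The curvature-from-gradients idea behind the quasi-Newton updates can be illustrated with one BFGS update of an inverse Hessian approximation (Python, illustrative only; the step s and gradient change y are assumed example values). The key property is the secant condition H_new y = s:

```python
# Illustration (not PROC NLP syntax): one BFGS update of an inverse
# Hessian approximation H. The update enforces the secant condition
# H_new @ y = s, which lets quasi-Newton methods build curvature
# information from gradients alone.

def bfgs_inverse_update(H, s, y):
    n = len(s)
    rho = 1.0 / sum(si * yi for si, yi in zip(s, y))
    I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # H_new = (I - rho s y^T) H (I - rho y s^T) + rho s s^T
    A = [[I[i][j] - rho * s[i] * y[j] for j in range(n)] for i in range(n)]
    B = [[I[i][j] - rho * y[i] * s[j] for j in range(n)] for i in range(n)]
    AH = [[sum(A[i][k] * H[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    AHB = [[sum(AH[i][k] * B[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return [[AHB[i][j] + rho * s[i] * s[j] for j in range(n)] for i in range(n)]

# assumed step s and gradient change y
s = [1.0, 0.0]
y = [2.0, 0.5]
H = [[1.0, 0.0], [0.0, 1.0]]
H_new = bfgs_inverse_update(H, s, y)
Hy = [sum(H_new[i][j] * y[j] for j in range(2)) for i in range(2)]
```

The dual variants described above apply the analogous update to a Cholesky factor of the (direct) Hessian approximation instead.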
Nonlinearly Constrained Problems The algorithm used for nonlinearly constrained quasi-
Newton optimization is an efficient modification of Powell's (1978a, 1982b) Variable Metric Con-
strained WatchDog (VMCWD) algorithm. A similar but older algorithm (VF02AD) is part of the
Harwell library. Both VMCWD and VF02AD use Fletcher's VE02AD algorithm (part of the Harwell
library) for positive-definite quadratic programming. The PROC NLP QUANEW implementation
uses a quadratic programming subroutine that updates and downdates the approximation of the
Cholesky factor when the active set changes. The nonlinear QUANEW algorithm is not a feasible-
point algorithm, and the value of the objective function need not decrease (minimization) or increase
(maximization) monotonically. Instead, the algorithm tries to reduce a linear combination of the
objective function and constraint violations, called the merit function.
The following are similarities and differences between this algorithm and the VMCWD algorithm:
- A modification of this algorithm can be performed by specifying VERSION=1, which replaces
the update of the Lagrange vector \mu with the original update of Powell (1978a, b) that is
used in VF02AD. This can be helpful for some applications with linearly dependent active
constraints.
- If the VERSION= option is not specified or if VERSION=2 is specified, the evaluation of the
Lagrange vector \mu is performed in the same way as Powell (1982b) describes.
- Instead of updating an approximate Hessian matrix, this algorithm uses the dual BFGS (or
DFP) update that updates the Cholesky factor of an approximate Hessian. If the condition of
the updated matrix gets too bad, a restart is done with a positive diagonal matrix. At the end of
the first iteration after each restart, the Cholesky factor is scaled.
- The Cholesky factor is loaded into the quadratic programming subroutine, automatically
ensuring positive definiteness of the problem. During the quadratic programming step, the
Cholesky factor of the projected Hessian matrix Z_k^T G Z_k and the QT decomposition are
updated simultaneously when the active set changes. Refer to Gill et al. (1984) for more
information.
- The line-search strategy is very similar to that of Powell (1982b). However, this algorithm
does not call for derivatives during the line search, so the algorithm generally needs fewer
derivative calls than function calls. VMCWD always requires the same number of derivative
and function calls. Sometimes Powell's line-search method uses steps that are too long. In
these cases, use the INSTEP= option to restrict the step length.
- The watchdog strategy is similar to that of Powell (1982b); however, it does not return auto-
matically after a fixed number of iterations to a former better point. A return here is further
delayed if the observed function reduction is close to the expected function reduction of the
quadratic model.
- The Powell termination criterion is still used (as FCONV2), but the QUANEW implementation
uses two additional termination criteria (GCONV and ABSGCONV).
The nonlinear QUANEW algorithm needs the Jacobian matrix of the first-order derivatives (constraint
normals) of the constraints, CJ(x).
You can specify two update formulas with the UPDATE= option:
DBFGS performs the dual BFGS update of the Cholesky factor of the Hessian matrix.
This is the default.
DDFP performs the dual DFP update of the Cholesky factor of the Hessian matrix.
This algorithm uses its own line-search technique. No options or parameters (except the INSTEP=
option) controlling the line search in the other algorithms apply here. In several applications, large
steps in the first iterations were troublesome. You can use the INSTEP= option to impose an upper
bound for the step length during the first five iterations. You may also use the INHESSIAN= option
to specify a different starting approximation for the Hessian. Choosing simply the INHESSIAN
option will use the Cholesky factor of a (possibly ridged) finite-difference approximation of the Hessian
to initialize the quasi-Newton update process. The values of the LCSINGULAR=, LCEPSILON=,
and LCDEACT= options, which control the processing of linear and boundary constraints, are valid
only for the quadratic programming subroutine used in each iteration of the nonlinear constraints
QUANEW algorithm.
Double Dogleg Optimization (DBLDOG)
The double dogleg optimization method combines the ideas of the quasi-Newton and trust region
methods. In each iteration, the double dogleg algorithm computes the step s^{(k)} as a linear combination
of the steepest descent or ascent search direction s_1^{(k)} and a quasi-Newton search direction s_2^{(k)}:

    s^{(k)} = \alpha_1 s_1^{(k)} + \alpha_2 s_2^{(k)}
The step is requested to remain within a prespecified trust region radius; refer to Fletcher (1987,
p. 107). Thus, the DBLDOG subroutine uses the dual quasi-Newton update but does not perform a
line search. Two update formulas can be specified with the UPDATE= option:
DBFGS performs the dual BFGS (Broyden, Fletcher, Goldfarb, & Shanno) update of the
Cholesky factor of the Hessian matrix. This is the default.
DDFP performs the dual DFP (Davidon, Fletcher, & Powell) update of the Cholesky
factor of the Hessian matrix.
The double dogleg optimization technique works well for medium to moderately large optimization
problems where the objective function and the gradient are much faster to compute than the Hessian.
The implementation is based on Dennis and Mei (1979) and Gay (1983) but is extended for dealing
with boundary and linear constraints. DBLDOG generally needs more iterations than the techniques
TRUREG, NEWRAP, or NRRIDG that need second-order derivatives, but each of the DBLDOG
iterations is computationally cheap. Furthermore, DBLDOG needs only gradient calls for the update
of the Cholesky factor of an approximate Hessian.
Conjugate Gradient Optimization (CONGRA)
Second-order derivatives are not used by CONGRA. The CONGRA algorithm can be expensive
in function and gradient calls but needs only O(n) memory for unconstrained optimization. In
general, many iterations are needed to obtain a precise solution, but each of the CONGRA iterations
is computationally cheap. Four different update formulas for generating the conjugate directions can
be specified using the UPDATE= option:
PB performs the automatic restart update method of Powell (1977) and Beale (1972).
This is the default.
FR performs the Fletcher-Reeves update (Fletcher 1987).
PR performs the Polak-Ribiere update (Fletcher 1987).
CD performs a conjugate-descent update of Fletcher (1987).
The default value is UPDATE=PB, since it behaved best in most test examples. You are advised to
avoid the option UPDATE=CD, as it behaved worst in most test examples.
The CONGRA subroutine should be used for optimization problems with large n. For the un-
constrained or boundary constrained case, CONGRA needs only O(n) bytes of working memory,
whereas all other optimization methods require O(n^2) bytes of working memory. During n
successive iterations, uninterrupted by restarts or changes in the working set, the conjugate gradient
algorithm computes a cycle of n conjugate search directions. In each iteration, a line search is done
along the search direction to find an approximate optimum of the objective function. The default
line-search method uses quadratic interpolation and cubic extrapolation to obtain a step length
satisfying the Goldstein conditions. One of the Goldstein conditions can be violated if the feasible
region defines an upper limit for the step length. Other line-search algorithms can be specified with
the LINESEARCH= option.
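A minimal Fletcher-Reeves iteration on a convex quadratic illustrates why CONGRA stores only vectors, never a matrix (Python, illustrative only; the diagonal quadratic and the exact line search for it are assumptions of the example):

```python
# Illustration (not PROC NLP syntax): conjugate gradient iterations with
# the Fletcher-Reeves update on f(x) = x1^2 + 5*x2^2. Only the current
# gradient and the previous direction are stored: O(n) memory.

def grad(x):
    return [2 * x[0], 10 * x[1]]

x = [4.0, 1.0]
g = grad(x)
d = [-g[0], -g[1]]
for _ in range(50):
    if g[0] * g[0] + g[1] * g[1] < 1e-24:
        break                      # converged
    # exact line search for this quadratic: alpha = -g.d / d^T A d,
    # with A = diag(2, 10)
    gd = g[0] * d[0] + g[1] * d[1]
    dAd = 2 * d[0] * d[0] + 10 * d[1] * d[1]
    alpha = -gd / dAd
    x = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
    g_new = grad(x)
    # Fletcher-Reeves coefficient
    beta = (g_new[0] ** 2 + g_new[1] ** 2) / (g[0] ** 2 + g[1] ** 2)
    d = [-g_new[0] + beta * d[0], -g_new[1] + beta * d[1]]
    g = g_new
```

On an n-dimensional quadratic, a cycle of n such conjugate directions reaches the minimizer, matching the cycle behavior described above.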
Nelder-Mead Simplex Optimization (NMSIMP)
The Nelder-Mead simplex method does not use any derivatives and does not assume that the objective
function has continuous derivatives. The objective function itself needs to be continuous. This
technique requires a large number of function evaluations. It is unlikely to give accurate results for
n > 40.
Depending on the kind of constraints, one of the following Nelder-Mead simplex algorithms is used:
- unconstrained or only boundary constrained problems
The original Nelder-Mead simplex algorithm is implemented and extended to boundary
constraints. This algorithm does not compute the objective for infeasible points. This algorithm
is automatically invoked if the LINCON or NLINCON statement is not specified.
- general linearly constrained or nonlinearly constrained problems
A slightly modified version of Powell's (1992) COBYLA (Constrained Optimization BY
Linear Approximations) implementation is used. This algorithm is automatically invoked if
either the LINCON or the NLINCON statement is specified.
The original Nelder-Mead algorithm cannot be used for general linear or nonlinear constraints
but can be faster for the unconstrained or boundary constrained case. The original Nelder-Mead
algorithm changes the shape of the simplex by adapting to the nonlinearities of the objective function,
which contributes to an increased speed of convergence. The two NMSIMP subroutines use special
sets of termination criteria. For more details, refer to the section Termination Criteria on page 377.
Powell's COBYLA Algorithm (COBYLA)
Powell's COBYLA algorithm is a sequential trust region algorithm (originally with a monotonically
decreasing radius \rho of a spheric trust region) that tries to maintain a regular-shaped simplex over
the iterations. A small modification was made to the original algorithm that permits an increase of
the trust region radius \rho in special situations. A sequence of iterations is performed with a constant
trust region radius \rho until the computed objective function reduction is much less than the predicted
reduction. Then, the trust region radius \rho is reduced. The trust region radius is increased only if
the computed function reduction is relatively close to the predicted reduction and the simplex is
well-shaped. The start radius \rho_{beg} and the final radius \rho_{end} can be specified using \rho_{beg}=INSTEP and
\rho_{end}=ABSXTOL. The convergence to small values of \rho_{end} (high precision) may take many calls of
the function and constraint modules and may result in numerical problems. There are two main
reasons for the slow convergence of the COBYLA algorithm:

- Only linear approximations of the objective and constraint functions are used locally.
- Maintaining the regular-shaped simplex and not adapting its shape to nonlinearities yields very
small simplices for highly nonlinear functions (for example, fourth-order polynomials).
Nonlinear Least Squares Optimization

Levenberg-Marquardt Least Squares Method (LEVMAR)

The Levenberg-Marquardt method is a modification of the trust region method for nonlinear least squares problems and is implemented as in Moré (1978).

This is the recommended algorithm for small to medium least squares problems. Large least squares problems can be transformed into minimization problems, which can be processed with conjugate gradient or (dual) quasi-Newton techniques. In each iteration, LEVMAR solves a quadratically constrained quadratic minimization problem that restricts the step to stay at the surface of or inside an n-dimensional elliptical (or spherical) trust region. In each iteration, LEVMAR uses the crossproduct Jacobian matrix J^T J as an approximate Hessian matrix.
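The use of the crossproduct Jacobian J^T J as an approximate Hessian can be sketched as follows. This is an illustrative Python sketch of a single damped Gauss-Newton (Levenberg-Marquardt-style) step with a simple ridge term, not the Moré (1978) trust region implementation that LEVMAR actually uses; all function names are hypothetical.

```python
# Sketch of one Levenberg-Marquardt-style step for a least squares problem
# min_x sum_i f_i(x)^2, using the crossproduct Jacobian J^T J as the
# approximate Hessian. Illustrative only; PROC NLP's LEVMAR follows
# More (1978) with a trust region rather than this simple ridge.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def solve2(M, rhs):
    # Cramer's rule for a 2x2 system
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det]

def levmar_step(J, f, lam):
    """Solve (J^T J + lam * I) s = -J^T f for the step s (n = 2 here)."""
    JtJ = matmul(transpose(J), J)
    for i in range(2):
        JtJ[i][i] += lam
    rhs = [-g for g in matvec(transpose(J), f)]
    return solve2(JtJ, rhs)

# Linear model y = x1 + x2*t with data t = (0, 1, 2), y = (1, 3, 5):
J = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]        # Jacobian of residuals
x = [0.0, 0.0]
f = [x[0] + x[1] * t - y for t, y in [(0, 1), (1, 3), (2, 5)]]
s = levmar_step(J, f, lam=0.0)                  # Gauss-Newton step (lam = 0)
x_new = [xi + si for xi, si in zip(x, s)]
print(x_new)                                    # exact fit: [1.0, 2.0]
```

Because the residuals here are linear, the undamped step (lam = 0) lands on the least squares solution in one iteration; a positive lam shortens the step toward the negative gradient direction.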
Hybrid Quasi-Newton Least Squares Methods (HYQUAN)

In each iteration of one of the Fletcher and Xu (1987) (refer also to Al-Baali and Fletcher (1985, 1986)) hybrid quasi-Newton methods, a criterion is used to decide whether a Gauss-Newton or a dual quasi-Newton search direction is appropriate. The VERSION= option can be used to choose one of three criteria (HY1, HY2, HY3) proposed by Fletcher and Xu (1987). The default is VERSION=2; that is, HY2. In each iteration, HYQUAN computes the crossproduct Jacobian (used for the Gauss-Newton step), updates the Cholesky factor of an approximate Hessian (used for the quasi-Newton step), and does a line search to compute an approximate minimum along the search direction. The default line-search technique used by HYQUAN is especially designed for least squares problems (refer to Lindström and Wedin (1984) and Al-Baali and Fletcher (1986)). Using the LINESEARCH= option, you can choose a different line-search algorithm than the default one.
Two update formulas can be specified with the UPDATE= option:
DBFGS performs the dual BFGS (Broyden, Fletcher, Goldfarb, and Shanno) update of
the Cholesky factor of the Hessian matrix. This is the default.
DDFP performs the dual DFP (Davidon, Fletcher, and Powell) update of the Cholesky
factor of the Hessian matrix.
The HYQUAN subroutine needs about the same amount of working memory as the LEVMAR
algorithm. In most applications, LEVMAR seems to be superior to HYQUAN, and using HYQUAN
is recommended only when problems are experienced with the performance of LEVMAR.
Finite-Difference Approximations of Derivatives

The FD= and FDHESSIAN= options specify the use of finite-difference approximations of the derivatives. The FD= option specifies that all derivatives are approximated using function evaluations, and the FDHESSIAN= option specifies that second-order derivatives are approximated using gradient evaluations.

Computing derivatives by finite-difference approximations can be very time-consuming, especially for second-order derivatives based only on values of the objective function (FD= option). If analytical derivatives are difficult to obtain (for example, if a function is computed by an iterative process), you might consider one of the optimization techniques that uses first-order derivatives only (TECH=QUANEW, TECH=DBLDOG, or TECH=CONGRA).
Forward-Difference Approximations

The forward-difference derivative approximations consume less computer time but are usually not as precise as those using central-difference formulas.

- First-order derivatives: n additional function calls are needed:

      g_i = ∂f/∂x_i ≈ [f(x + h_i e_i) − f(x)] / h_i

- Second-order derivatives based on function calls only (Dennis and Schnabel 1983, pp. 80, 104): for a dense Hessian, n(n+3)/2 additional function calls are needed:

      ∂²f/(∂x_i ∂x_j) ≈ [f(x + h_i e_i + h_j e_j) − f(x + h_i e_i) − f(x + h_j e_j) + f(x)] / (h_i h_j)

- Second-order derivatives based on gradient calls (Dennis and Schnabel 1983, p. 103): n additional gradient calls are needed:

      ∂²f/(∂x_i ∂x_j) ≈ [g_i(x + h_j e_j) − g_i(x)] / (2 h_j) + [g_j(x + h_i e_i) − g_j(x)] / (2 h_i)
Central-Difference Approximations

- First-order derivatives: 2n additional function calls are needed:

      g_i = ∂f/∂x_i ≈ [f(x + h_i e_i) − f(x − h_i e_i)] / (2 h_i)

- Second-order derivatives based on function calls only (Abramowitz and Stegun 1972, p. 884): for a dense Hessian, 2n(n+1) additional function calls are needed:

      ∂²f/∂x_i² ≈ [−f(x + 2 h_i e_i) + 16 f(x + h_i e_i) − 30 f(x) + 16 f(x − h_i e_i) − f(x − 2 h_i e_i)] / (12 h_i²)

      ∂²f/(∂x_i ∂x_j) ≈ [f(x + h_i e_i + h_j e_j) − f(x + h_i e_i − h_j e_j) − f(x − h_i e_i + h_j e_j) + f(x − h_i e_i − h_j e_j)] / (4 h_i h_j)

- Second-order derivatives based on gradient calls: 2n additional gradient calls are needed:

      ∂²f/(∂x_i ∂x_j) ≈ [g_i(x + h_j e_j) − g_i(x − h_j e_j)] / (4 h_j) + [g_j(x + h_i e_i) − g_j(x − h_i e_i)] / (4 h_i)
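The first-order forward- and central-difference formulas can be illustrated numerically. This Python sketch is not PROC NLP code; it uses the common step-length choices h_j = ε^(1/2)(1 + |x_j|) for forward differences and h_j = ε^(1/3)(1 + |x_j|) for central differences, where ε is the machine precision, and a test function with a known gradient.

```python
# Forward- vs central-difference gradient approximations for
# f(x) = exp(x1) + x2^2 at x = (0, 1); the exact gradient is (1, 2).
# Illustrative sketch; PROC NLP chooses h_j from the FDIGITS=/CDIGITS= options.
import math

def f(x):
    return math.exp(x[0]) + x[1] ** 2

def grad_forward(f, x, eta):
    g = []
    for i in range(len(x)):
        h = eta ** 0.5 * (1 + abs(x[i]))        # h_j = eta^(1/2) (1 + |x_j|)
        xp = list(x); xp[i] += h
        g.append((f(xp) - f(x)) / h)            # n extra function calls
    return g

def grad_central(f, x, eta):
    g = []
    for i in range(len(x)):
        h = eta ** (1.0 / 3.0) * (1 + abs(x[i]))  # h_j = eta^(1/3) (1 + |x_j|)
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))     # 2n extra function calls
    return g

eps = 2.0 ** -52                  # double precision machine epsilon
x = [0.0, 1.0]
gf = grad_forward(f, x, eps)
gc = grad_central(f, x, eps)
err_f = max(abs(a - b) for a, b in zip(gf, [1.0, 2.0]))
err_c = max(abs(a - b) for a, b in zip(gc, [1.0, 2.0]))
print(err_f, err_c)               # central error is typically much smaller
```

The forward formula costs n extra function calls but has O(h) truncation error; the central formula costs 2n calls with O(h²) error, which is why it pairs with the larger cube-root step.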
The FDIGITS= and CDIGITS= options can be used for specifying the number of accurate digits in the evaluation of the objective function and nonlinear constraints. These specifications are helpful in determining an appropriate interval length h to be used in the finite-difference formulas.

The FDINT= option specifies whether the finite-difference intervals h should be computed using an algorithm of Gill, Murray, Saunders, and Wright (1983) or based only on the information of the FDIGITS= and CDIGITS= options. For FDINT=OBJ, the interval h is based on the behavior of the objective function; for FDINT=CON, the interval h is based on the behavior of the nonlinear constraint functions; and for FDINT=ALL, the interval h is based on the behaviors of both the objective function and the nonlinear constraint functions. Note that the algorithm of Gill, Murray, Saunders, and Wright (1983) to compute the finite-difference intervals h_j can be very expensive in the number of function calls. If the FDINT= option is specified, it is currently performed twice, the first time before the optimization process starts and the second time after the optimization terminates. If FDINT= is not specified, the step lengths h_j, j = 1,...,n, are defined as follows:
- for the forward-difference approximation of first-order derivatives using function calls and second-order derivatives using gradient calls:

      h_j = η_j^(1/2) (1 + |x_j|)

- for the forward-difference approximation of second-order derivatives that use only function calls and all central-difference formulas:

      h_j = η_j^(1/3) (1 + |x_j|)

where η is defined using the FDIGITS= option:

- If the number of accurate digits is specified with FDIGITS=r, η is set to 10^(−r).

- If FDIGITS= is not specified, η is set to the machine precision ε.
For FDINT=OBJ and FDINT=ALL, the FDIGITS= specification is used in computing the forward and central finite-difference intervals.

If the problem has nonlinear constraints and the FD= option is specified, the first-order formulas are used to compute finite-difference approximations of the Jacobian matrix JC(x). You can use the CDIGITS= option to specify the number of accurate digits in the constraint evaluations to define the step lengths h_j, j = 1,...,n. For FDINT=CON and FDINT=ALL, the CDIGITS= specification is used in computing the forward and central finite-difference intervals.

NOTE: If you are unable to specify analytic derivatives and the finite-difference approximations provided by PROC NLP are not good enough to solve your problem, you may program better finite-difference approximations using the GRADIENT, JACOBIAN, CRPJAC, or HESSIAN statement and the program statements.
Hessian and CRP Jacobian Scaling

The rows and columns of the Hessian and crossproduct Jacobian matrix can be scaled when using the trust region, Newton-Raphson, double dogleg, and Levenberg-Marquardt optimization techniques. Each element G_{i,j}, i,j = 1,...,n, is divided by the scaling factor d_i d_j, where the scaling vector d = (d_1,...,d_n) is iteratively updated in a way specified by the HESCAL=i option, as follows:

i = 0: No scaling is done (equivalent to d_i = 1).

i ≠ 0: First iteration and each restart iteration:

      d_i^(0) = √(max(|G_{i,i}^(0)|, ε))

i = 1: refer to Moré (1978):

      d_i^(k+1) = max(d_i^(k), √(max(|G_{i,i}^(k)|, ε)))

i = 2: refer to Dennis, Gay, and Welsch (1981):

      d_i^(k+1) = max(0.6 d_i^(k), √(max(|G_{i,i}^(k)|, ε)))

i = 3: d_i is reset in each iteration:

      d_i^(k+1) = √(max(|G_{i,i}^(k)|, ε))

where ε is the relative machine precision or, equivalently, the largest double precision value that, when added to 1, results in 1.
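The HESCAL= updates can be sketched as follows. This is hypothetical standalone Python mirroring the case formulas above, not PROC NLP internals.

```python
# Sketch of the HESCAL= scaling updates: the scaling vector d is built
# from the diagonal of the Hessian G, and each element G_ij is divided
# by d_i * d_j. Hypothetical standalone code.
import math

EPS = 2.0 ** -52  # relative machine precision

def d_start(G):
    # first iteration and each restart: d_i = sqrt(max(|G_ii|, eps))
    return [math.sqrt(max(abs(G[i][i]), EPS)) for i in range(len(G))]

def d_update(d, G, hescal):
    new = [math.sqrt(max(abs(G[i][i]), EPS)) for i in range(len(G))]
    if hescal == 1:   # More (1978): scaling factors never decrease
        return [max(di, ni) for di, ni in zip(d, new)]
    if hescal == 2:   # Dennis, Gay, and Welsch (1981): slow decrease allowed
        return [max(0.6 * di, ni) for di, ni in zip(d, new)]
    if hescal == 3:   # reset in each iteration
        return new
    return [1.0] * len(d)  # HESCAL=0: no scaling

def scale(G, d):
    # each element G_ij is divided by the scaling factor d_i * d_j
    n = len(G)
    return [[G[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]

G = [[4.0, 1.0], [1.0, 9.0]]
d = d_start(G)          # [2.0, 3.0]
print(scale(G, d))      # unit diagonal after scaling
```

Scaling equalizes the diagonal of the Hessian, which helps the trust region and ridge-based methods treat badly scaled parameters more uniformly.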
Testing the Gradient Specification

There are three main ways to check the correctness of derivative specifications:

- Specify the FD= or FDHESSIAN= option in the PROC NLP statement to compute finite-difference approximations of first- and second-order derivatives. In many applications, the finite-difference approximations are computed with high precision and do not differ too much from the derivatives that are computed by specified formulas.

- Specify the GRADCHECK option in the PROC NLP statement to compute and display a test vector and a test matrix of the gradient values at the starting point x^(0) by the method of Wolfe (1982). If you do not specify the GRADCHECK option, a fast derivative test identical to the GRADCHECK=FAST specification is done by default.

- If the default analytical derivative compiler is used or if derivatives are specified using the GRADIENT or JACOBIAN statement, the gradient or Jacobian computed at the initial point x^(0) is tested by default using finite-difference approximations. In some examples, the relative test can show significant differences between the two forms of derivatives, resulting in a warning message indicating that the specified derivatives could be wrong, even if they are correct. This happens especially in cases where the magnitude of the gradient at the starting point x^(0) is small.
The algorithm of Wolfe (1982) is used to check whether the gradient g(x) specified by a GRADIENT statement (or indirectly by a JACOBIAN statement) is appropriate for the objective function f(x) specified by the program statements.

Using function and gradient evaluations in the neighborhood of the starting point x^(0), second derivatives are approximated by finite-difference formulas. Forward differences of gradient values, with a small step length δ, are used to approximate the Hessian element G_{jk}:

      G_{jk} ≈ H_{jk} = [g_j(x + δ e_k) − g_j(x)] / δ

The test vector, with elements

      s_j = H_{jj} − (2/δ) ( [f(x + δ e_j) − f(x)] / δ − g_j(x) )

contains the differences between two sets of finite-difference approximations for the diagonal elements of the Hessian matrix

      G_{jj} = ∂²f(x^(0))/∂x_j²,  j = 1,...,n

The test matrix ΔH contains the absolute differences of symmetric elements in the approximate Hessian, |H_{jk} − H_{kj}|, j,k = 1,...,n, generated by forward differences of the gradient elements.

If the specification of the first derivatives is correct, the elements of the test vector and test matrix should be relatively small. The location of large elements in the test matrix points to erroneous coordinates in the gradient specification. For very large optimization problems, this algorithm can be too expensive in terms of computer time and memory.
Termination Criteria

All optimization techniques stop iterating at x^(k) if at least one of a set of termination criteria is satisfied. PROC NLP also terminates if the point x^(k) is fully constrained by n linearly independent active linear or boundary constraints, and all Lagrange multiplier estimates of active inequality constraints are greater than a small negative tolerance.

Since the Nelder-Mead simplex algorithm does not use derivatives, no termination criterion is available based on the gradient of the objective function. Powell's COBYLA algorithm uses only one more termination criterion. COBYLA is a trust region algorithm that sequentially reduces the radius ρ of a spherical trust region from a start radius ρ_beg = INSTEP to the final radius ρ_end = ABSXTOL. The default value is ρ_end = 1E−4. The convergence to small values of ρ_end (high precision) may take many calls of the function and constraint modules and may result in numerical problems.
In some applications, the small default value of the ABSGCONV= criterion is too difficult to satisfy for some of the optimization techniques. This occurs most often when finite-difference approximations of derivatives are used.

The default setting for the GCONV= option sometimes leads to early termination far from the location of the optimum. This is especially true for the special form of this criterion used in the CONGRA optimization.
The QUANEW algorithm for nonlinearly constrained optimization does not monotonically reduce the value of either the objective function or some kind of merit function that combines objective and constraint functions. Furthermore, the algorithm uses the watchdog technique with backtracking (Chamberlain et al. 1982). Therefore, no termination criteria were implemented that are based on the values (x or f) of successive iterations. In addition to the criteria used by all optimization techniques, three more termination criteria are currently available. They are based on satisfying the Karush-Kuhn-Tucker conditions, which require that the gradient of the Lagrange function be zero at the optimal point (x*, λ*):

      ∇_x L(x*, λ*) = 0

For more information, refer to the section Criteria for Optimality on page 359.
Active Set Methods

The parameter vector x ∈ R^n may be subject to a set of m linear equality and inequality constraints:

      Σ_{j=1}^n a_{ij} x_j = b_i,  i = 1,...,m_e

      Σ_{j=1}^n a_{ij} x_j ≥ b_i,  i = m_e + 1,...,m

The coefficients a_{ij} and right-hand sides b_i of the equality and inequality constraints are collected in the m × n matrix A and the m-vector b.
The m linear constraints define a feasible region G in R^n that must contain the point x* that minimizes the problem. If the feasible region G is empty, no solution to the optimization problem exists.

All optimization techniques in PROC NLP (except those processing nonlinear constraints) are active set methods. The iteration starts with a feasible point x^(0), which either is provided by the user or can be computed by the Schittkowski and Stoer (1979) algorithm implemented in PROC NLP. The algorithm then moves from one feasible point x^(k−1) to a better feasible point x^(k) along a feasible search direction s^(k):

      x^(k) = x^(k−1) + α^(k) s^(k),  α^(k) > 0

Theoretically, the path of points x^(k) never leaves the feasible region G of the optimization problem, but it can hit its boundaries. The active set A^(k) of point x^(k) is defined as the index set of all linear equality constraints and those inequality constraints that are satisfied as equalities at x^(k). If no constraint is active for x^(k), the point is located in the interior of G, and the active set A^(k) is empty. If the point x^(k) in iteration k hits the boundary of inequality constraint i, this constraint i becomes active and is added to A^(k). Each equality or active inequality constraint reduces the dimension (degrees of freedom) of the optimization problem.
In practice, the active constraints can be satisfied only with finite precision. The LCEPSILON=r option specifies the range for active and violated linear constraints. If the point x^(k) satisfies the condition

      | Σ_{j=1}^n a_{ij} x_j^(k) − b_i | ≤ t

where t = r (|b_i| + 1), the constraint i is recognized as an active constraint. Otherwise, the constraint i is either an inactive inequality or a violated inequality or equality constraint. Due to rounding errors in computing the projected search direction, error can be accumulated so that an iterate x^(k) steps out of the feasible region. In those cases, PROC NLP may try to pull the iterate x^(k) back into the feasible region. However, in some cases the algorithm needs to increase the feasible region by increasing the LCEPSILON=r value. If this happens, it is indicated by a message displayed in the log output.
If you cannot expect an improvement in the value of the objective function by moving from an active constraint back into the interior of the feasible region, you use this inequality constraint as an equality constraint in the next iteration. That means the active set A^(k+1) still contains the constraint i. Otherwise you release the active inequality constraint and increase the dimension of the optimization problem in the next iteration.

A serious numerical problem can arise when some of the active constraints become (nearly) linearly dependent. Linearly dependent equality constraints are removed before entering the optimization. You can use the LCSINGULAR= option to specify a criterion r used in the update of the QR decomposition that decides whether an active constraint is linearly dependent relative to a set of other active constraints.
If the final parameter set x* is subjected to n_act linear equality or active inequality constraints, the QR decomposition of the n × n_act matrix A^T of the linear constraints is computed by

      A^T = QR

where Q is an n × n orthogonal matrix and R is an n × n_act upper triangular matrix. The n columns of matrix Q can be separated into two matrices, Q = [Y, Z], where Y contains the first n_act orthogonal columns of Q and Z contains the last n − n_act orthogonal columns of Q. The n × (n − n_act) column-orthogonal matrix Z is also called the nullspace matrix of the active linear constraints A^T. The n − n_act columns of the n × (n − n_act) matrix Z form a basis orthogonal to the rows of the n_act × n matrix A.
At the end of the iteration process, PROC NLP can display the projected gradient

      g_Z = Z^T g

In the case of boundary-constrained optimization, the elements of the projected gradient correspond to the gradient elements of the free parameters. A necessary condition for x* to be a local minimum of the optimization problem is

      g_Z(x*) = Z^T g(x*) = 0

The symmetric (n − n_act) × (n − n_act) matrix

      G_Z = Z^T G Z

is called a projected Hessian matrix. A second-order necessary condition for x* to be a local minimizer requires that the projected Hessian matrix be positive semidefinite. If available, the projected gradient and projected Hessian matrix can be displayed and written to an OUTEST= data set.
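The projected gradient condition can be illustrated for a single active linear constraint in R². This is hypothetical standalone Python: the nullspace basis Z is built directly from the constraint normal rather than from a QR decomposition of A^T as PROC NLP does.

```python
# Sketch of the projected gradient g_Z = Z^T g for one active linear
# constraint a^T x = b in R^2. Here n = 2 and n_act = 1, so Z has one
# column and g_Z is a scalar. Hypothetical standalone code.
import math

a = [1.0, 1.0]                      # active constraint: x1 + x2 = 2
norm = math.hypot(a[0], a[1])
z = [-a[1] / norm, a[0] / norm]     # unit vector orthogonal to a

def grad(x):                        # gradient of f(x) = (x1-3)^2 + (x2-1)^2
    return [2 * (x[0] - 3), 2 * (x[1] - 1)]

def projected_gradient(x):
    g = grad(x)
    return z[0] * g[0] + z[1] * g[1]   # g_Z = Z^T g

# The minimizer of f subject to x1 + x2 = 2 is (2, 0):
print(projected_gradient([2.0, 0.0]))  # ~0: first-order condition holds
print(projected_gradient([0.0, 2.0]))  # nonzero: feasible but not optimal
```

At the constrained minimizer the full gradient need not vanish; only its component along the constraint surface (the projected gradient) must be zero.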
Those elements of the n_act vector of first-order estimates of Lagrange multipliers,

      λ = (A A^T)^(−1) A Z Z^T g

which correspond to active inequality constraints indicate whether an improvement of the objective function can be obtained by releasing this active constraint. For minimization (maximization), a significant negative (positive) Lagrange multiplier indicates that a possible reduction (increase) of the objective function can be obtained by releasing this active linear constraint. The LCDEACT=r option can be used to specify a threshold r for the Lagrange multiplier that decides whether an active inequality constraint remains active or can be deactivated. The Lagrange multipliers are displayed (and written to an OUTEST= data set) only if linear constraints are active at the solution x*. (In the case of boundary-constrained optimization, the Lagrange multipliers for active lower (upper) constraints are the negative (positive) gradient elements corresponding to the active parameters.)
Feasible Starting Point

Two algorithms are used to obtain a feasible starting point.

- When only boundary constraints are specified:

  If the parameter x_j, 1 ≤ j ≤ n, violates a two-sided boundary constraint (or an equality constraint) l_j ≤ x_j ≤ u_j, the parameter is given a new value inside the feasible interval, as follows:

      x_j = l_j,                     if u_j ≤ l_j
      x_j = l_j + (1/2)(u_j − l_j),  if u_j − l_j < 4
      x_j = l_j + (1/10)(u_j − l_j), if u_j − l_j ≥ 4
  If the parameter x_j, 1 ≤ j ≤ n, violates a one-sided boundary constraint l_j ≤ x_j or x_j ≤ u_j, the parameter is given a new value near the violated boundary, as follows:

      x_j = l_j + max(1, (1/10) l_j),  if x_j < l_j
      x_j = u_j − max(1, (1/10) u_j),  if x_j > u_j

- When general linear constraints are specified, the algorithm of Schittkowski and Stoer (1979) computes a feasible point, which may be quite far from a user-specified infeasible point.
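The boundary-constraint adjustment above can be sketched as follows. This is hypothetical standalone Python mirroring the case formulas, not PROC NLP code.

```python
# Sketch of the starting-point adjustment for boundary constraints:
# a violating parameter is moved inside the interval (two-sided case)
# or near the violated bound (one-sided case). Hypothetical code.

def adjust(x, lower=None, upper=None):
    """Return a feasible starting value for one parameter x."""
    if lower is not None and upper is not None:
        if not (lower <= x <= upper):
            if upper <= lower:                   # equality constraint
                return lower
            if upper - lower < 4:                # narrow interval: midpoint
                return lower + (upper - lower) / 2.0
            return lower + (upper - lower) / 10.0
        return x
    if lower is not None and x < lower:          # one-sided: l <= x
        return lower + max(1.0, lower / 10.0)
    if upper is not None and x > upper:          # one-sided: x <= u
        return upper - max(1.0, upper / 10.0)
    return x

print(adjust(-5.0, lower=0.0, upper=2.0))   # 1.0 (midpoint, width < 4)
print(adjust(-5.0, lower=0.0, upper=10.0))  # 1.0 (l + width/10)
print(adjust(-5.0, lower=3.0))              # 4.0 (l + max(1, l/10))
print(adjust(99.0, upper=50.0))             # 45.0 (u - max(1, u/10))
```

Feasible values are left untouched; only violating parameters are moved.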
Line-Search Methods

In each iteration k, the (dual) quasi-Newton, hybrid quasi-Newton, conjugate gradient, and Newton-Raphson minimization techniques use iterative line-search algorithms that try to optimize a linear, quadratic, or cubic approximation of f along a feasible descent search direction s^(k),

      x^(k+1) = x^(k) + α^(k) s^(k),  α^(k) > 0

by computing an approximately optimal scalar α^(k).

Therefore, a line-search algorithm is an iterative process that optimizes a nonlinear function h = h(α) of one parameter (α) within each iteration k of the optimization technique, which itself tries to optimize a linear or quadratic approximation of the nonlinear objective function f = f(x) of n parameters x. Since the outside iteration process is based only on the approximation of the objective function, the inside iteration of the line-search algorithm does not have to be perfect. Usually, the choice of α significantly reduces (in a minimization) the objective function. Criteria often used for termination of line-search algorithms are the Goldstein conditions (refer to Fletcher (1987)).

Various line-search algorithms can be selected by using the LINESEARCH= option. The line-search method LINESEARCH=2 seems to be superior when function evaluation consumes significantly less computation time than gradient evaluation. Therefore, LINESEARCH=2 is the default value for Newton-Raphson, (dual) quasi-Newton, and conjugate gradient optimizations.

A special default line-search algorithm for TECH=HYQUAN is useful only for least squares problems and cannot be chosen by the LINESEARCH= option. This method uses three columns of the m × n Jacobian matrix, which for large m can require more memory than the algorithms designated by LINESEARCH=1 through LINESEARCH=8.

The line-search methods LINESEARCH=2 and LINESEARCH=3 can be modified to exact line search by using the LSPRECISION= option (specifying the σ parameter in Fletcher (1987)). The line-search methods LINESEARCH=1, LINESEARCH=2, and LINESEARCH=3 satisfy the left-hand-side and right-hand-side Goldstein conditions (refer to Fletcher (1987)). When derivatives are available, the line-search methods LINESEARCH=6, LINESEARCH=7, and LINESEARCH=8 try to satisfy the right-hand-side Goldstein condition; if derivatives are not available, these line-search algorithms use only function calls.
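A minimal line search terminated by the Goldstein conditions can be sketched as follows. This Python sketch is illustrative only; it is not any specific LINESEARCH= method in PROC NLP, and the constant c and the grow/shrink factors are arbitrary choices.

```python
# Sketch of a bracketing line search that accepts a step alpha
# satisfying the two Goldstein conditions:
#   f(x + alpha*s) <= f(x) + c * alpha * g^T s        (sufficient decrease)
#   f(x + alpha*s) >= f(x) + (1 - c) * alpha * g^T s  (step not too short)
# with 0 < c < 1/2. Illustrative only.

def goldstein_search(f, g_ts, x, s, alpha0=1.0, c=0.25, max_iter=60):
    """g_ts is the slope g^T s (< 0 for a descent direction)."""
    f0 = f(x)
    alpha = alpha0
    for _ in range(max_iter):
        x_new = [xi + alpha * si for xi, si in zip(x, s)]
        fa = f(x_new)
        upper = f0 + c * alpha * g_ts
        lower = f0 + (1 - c) * alpha * g_ts
        if lower <= fa <= upper:
            return alpha
        alpha *= 0.5 if fa > upper else 1.5    # shrink or grow the step
    return alpha

def f(x):
    return (x[0] - 1) ** 2

x = [5.0]
g = [2 * (x[0] - 1)]        # gradient = (8,)
s = [-g[0]]                 # steepest descent direction
alpha = goldstein_search(f, g[0] * s[0], x, s)
x_new = x[0] + alpha * s[0]
print(alpha, x_new)         # an accepted step that reduces f
```

The accepted step need not minimize h(α) exactly; the Goldstein band only guarantees it is neither too long nor too short, which is all the outer iteration requires.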
Restricting the Step Length

Almost all line-search algorithms use iterative extrapolation techniques that can easily lead them to (feasible) points where the objective function f is no longer defined (e.g., resulting in indefinite matrices for ML estimation) or is difficult to compute (e.g., resulting in floating point overflows). Therefore, PROC NLP provides options that restrict the step length α or the trust region radius Δ, especially during the first main iterations.

The inner product g^T s of the gradient g and the search direction s is the slope of h(α) = f(x + α s) along the search direction s. The default starting value α^(0) = α^(k,0) in each line-search algorithm (min_{α>0} f(x + α s)) during the main iteration k is computed in three steps:
1. The first step uses either the difference df = |f^(k) − f^(k−1)| of the function values during the last two consecutive iterations or the final step length value ᾱ of the last iteration k − 1 to compute a first value of α_1^(0).

   - Not using the DAMPSTEP=r option:

         α_1^(0) = step,  if 0.1 ≤ step ≤ 10
         α_1^(0) = 10,    if step > 10
         α_1^(0) = 0.1,   if step < 0.1

     with

         step = df / |g^T s|,  if |g^T s| ≥ ε max(100 df, 1)
         step = 1,             otherwise

     This value of α_1^(0) can be too large and lead to a difficult or impossible function evaluation, especially for highly nonlinear functions such as the EXP function.

   - Using the DAMPSTEP=r option:

         α_1^(0) = min(1, r ᾱ)

     The initial value for the new step length can be no larger than r times the final step length ᾱ of the previous iteration. The default value is r = 2.
2. During the first five iterations, the second step enables you to reduce α_1^(0) to a smaller starting value α_2^(0) using the INSTEP=r option:

       α_2^(0) = min(α_1^(0), r)

   After more than five iterations, α_2^(0) is set to α_1^(0).

3. The third step can further reduce the step length by

       α_3^(0) = min(α_2^(0), min(10, u))

   where u is the maximum length of a step inside the feasible region.
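The three-step computation of the starting step length can be sketched as follows. This is hypothetical standalone Python mirroring the rules above, not PROC NLP internals.

```python
# Sketch of the three-step starting step length alpha^(0):
#   step 1: first value from df = |f^(k) - f^(k-1)| and the slope g^T s
#           (or from DAMPSTEP=r and the previous final step length),
#   step 2: optional INSTEP=r reduction during the first five iterations,
#   step 3: respect the maximum feasible step length u.
EPS = 2.0 ** -52

def start_alpha(df, gts, alpha_prev=None, dampstep=None,
                instep=None, iteration=1, u=float("inf")):
    # Step 1: first value alpha_1
    if dampstep is not None:
        alpha1 = min(1.0, dampstep * alpha_prev)
    else:
        if abs(gts) >= EPS * max(100.0 * df, 1.0):
            step = df / abs(gts)
        else:
            step = 1.0
        alpha1 = min(10.0, max(0.1, step))   # clamp step to [0.1, 10]
    # Step 2: INSTEP=r reduction during the first five iterations
    if instep is not None and iteration <= 5:
        alpha2 = min(alpha1, instep)
    else:
        alpha2 = alpha1
    # Step 3: respect the maximum feasible step length u
    return min(alpha2, min(10.0, u))

print(start_alpha(df=50.0, gts=-2.0))                    # clamped to 10
print(start_alpha(df=0.4, gts=-2.0))                     # step = 0.2
print(start_alpha(df=0.4, gts=-2.0, instep=0.05))        # reduced to 0.05
print(start_alpha(df=0.4, gts=-2.0, alpha_prev=0.3, dampstep=2.0))  # 0.6
```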
The INSTEP=r option lets you specify a smaller or larger radius Δ of the trust region used in the first iteration of the trust region, double dogleg, and Levenberg-Marquardt algorithms. The default initial trust region radius Δ^(0) is the length of the scaled gradient (Moré 1978). This step corresponds to the default radius factor of r = 1. In most practical applications of the TRUREG, DBLDOG, and LEVMAR algorithms, this choice is successful. However, for bad initial values and highly nonlinear objective functions (such as the EXP function), the default start radius can result in arithmetic overflows. If this happens, you may try decreasing values of INSTEP=r, 0 < r < 1, until the iteration starts successfully. A small factor r also affects the trust region radius Δ^(k+1) of the next steps because the radius is changed in each iteration by a factor c, 0 < c ≤ 4, depending on the ratio ρ expressing the goodness of the quadratic function approximation. Reducing the radius Δ corresponds to increasing the ridge parameter λ, producing smaller steps directed more closely toward the (negative) gradient direction.
Computational Problems

First Iteration Overflows

If you use bad initial values for the parameters, the computation of the value of the objective function (and its derivatives) can lead to arithmetic overflows in the first iteration. The line-search algorithms that work with cubic extrapolation are especially sensitive to arithmetic overflows. If an overflow occurs with an optimization technique that uses line search, you can use the INSTEP= option to reduce the length of the first trial step during the line search of the first five iterations, or use the DAMPSTEP= or MAXSTEP= option to restrict the step length α in subsequent iterations. If an arithmetic overflow occurs in the first iteration of the trust region, double dogleg, or Levenberg-Marquardt algorithm, you can use the INSTEP= option to reduce the default trust region radius of the first iteration. You can also change the minimization technique or the line-search method. If none of these methods helps, consider the following actions:

- scale the parameters
- provide better initial values
- use boundary constraints to avoid the region where overflows may happen
- change the algorithm (specified in program statements) which computes the objective function
Problems in Evaluating the Objective Function

The starting point x^(0) must be a point that can be evaluated by all the functions involved in your problem. However, during optimization the optimizer may iterate to a point x^(k) where the objective function or nonlinear constraint functions and their derivatives cannot be evaluated. If you can identify the problematic region, you can prevent the algorithm from reaching it by adding another constraint to the problem. Another possibility is a modification of the objective function that produces a large, undesired function value there. As a result, the optimization algorithm reduces the step length and stays closer to the point that has been evaluated successfully in the previous iteration. For more information, refer to the section Missing Values in Program Statements on page 399.
Problems with Quasi-Newton Methods for Nonlinear Constraints

The sequential quadratic programming algorithm in QUANEW, which is used for solving nonlinearly constrained problems, can have problems updating the Lagrange multiplier vector μ. This usually results in very high values of the Lagrangian function and in watchdog restarts indicated in the iteration history. If this happens, there are three actions you can try:

- By default, the Lagrange vector μ is evaluated in the same way as Powell (1982b) describes. This corresponds to VERSION=2. By specifying VERSION=1, a modification of this algorithm replaces the update of the Lagrange vector μ with the original update of Powell (1978a, b), which is used in VF02AD.

- You can use the INSTEP= option to impose an upper bound for the step length α during the first five iterations.

- You can use the INHESSIAN= option to specify a different starting approximation for the Hessian. Choosing only the INHESSIAN option uses the Cholesky factor of a (possibly ridged) finite-difference approximation of the Hessian to initialize the quasi-Newton update process.
Other Convergence Difficulties

There are a number of things to try if the optimizer fails to converge.

- Check the derivative specification:

  If derivatives are specified by using the GRADIENT, HESSIAN, JACOBIAN, CRPJAC, or JACNLC statement, you can compare the specified derivatives with those computed by finite-difference approximations (specifying the FD= and FDHESSIAN= options). Use the GRADCHECK option to check whether the gradient g is correct. For more information, refer to the section Testing the Gradient Specification on page 376.

- Forward-difference derivatives specified with the FD= or FDHESSIAN= option may not be precise enough to satisfy strong gradient termination criteria. You may need to specify the more expensive central-difference formulas or use analytical derivatives. The finite-difference intervals may be too small or too big, and the finite-difference derivatives may be erroneous. You can specify the FDINT= option to compute better finite-difference intervals.

- Change the optimization technique:

  For example, if you use the default TECH=LEVMAR, you can

  - change to TECH=QUANEW or to TECH=NRRIDG
  - run some iterations with TECH=CONGRA, write the results in an OUTEST= data set, and use them as initial values specified by an INEST= data set in a second run with a different TECH= technique
- Change or modify the update technique and the line-search algorithm:

  This method applies only to TECH=QUANEW, TECH=HYQUAN, or TECH=CONGRA. For example, if you use the default update formula and the default line-search algorithm, you can

  - change the update formula with the UPDATE= option
  - change the line-search algorithm with the LINESEARCH= option
  - specify a more precise line search with the LSPRECISION= option, if you use LINESEARCH=2 or LINESEARCH=3

- Change the initial values by using a grid search specification to obtain a set of good feasible starting values.
Convergence to Stationary Point

The (projected) gradient at a stationary point is zero, and that results in a zero step length. The stopping criteria are satisfied.

There are two ways to avoid this situation:

- Use the DECVAR statement to specify a grid of feasible starting points.

- Use the OPTCHECK= option to avoid terminating at the stationary point.

The signs of the eigenvalues of the (reduced) Hessian matrix contain information regarding a stationary point:

- If all eigenvalues are positive, the Hessian matrix is positive definite, and the point is a minimum point.

- If some of the eigenvalues are positive and all remaining eigenvalues are zero, the Hessian matrix is positive semidefinite, and the point is a minimum or saddle point.

- If all eigenvalues are negative, the Hessian matrix is negative definite, and the point is a maximum point.

- If some of the eigenvalues are negative and all remaining eigenvalues are zero, the Hessian matrix is negative semidefinite, and the point is a maximum or saddle point.

- If all eigenvalues are zero, the point can be a minimum, maximum, or saddle point.
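The eigenvalue-sign classification above can be sketched for a symmetric 2×2 Hessian. This is hypothetical standalone Python; the closed-form 2×2 eigenvalue formula and the "saddle" label for the indefinite case (mixed positive and negative eigenvalues) are standard but are additions to the bullet list above.

```python
# Sketch of classifying a stationary point from the eigenvalue signs of
# a symmetric 2x2 (reduced) Hessian H. Hypothetical standalone code.
import math

def eigvals_2x2(H):
    # eigenvalues of [[a, b], [b, c]] via trace and determinant
    a, b, c = H[0][0], H[0][1], H[1][1]
    tr, det = a + c, a * c - b * b
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return [tr / 2 - root, tr / 2 + root]

def classify(H, tol=1e-12):
    ev = eigvals_2x2(H)
    pos = sum(1 for e in ev if e > tol)
    neg = sum(1 for e in ev if e < -tol)
    zero = len(ev) - pos - neg
    if pos and neg:
        return "saddle"                 # indefinite Hessian
    if pos:
        return "minimum" if not zero else "minimum or saddle"
    if neg:
        return "maximum" if not zero else "maximum or saddle"
    return "inconclusive"               # all eigenvalues zero

print(classify([[2.0, 0.0], [0.0, 3.0]]))   # minimum
print(classify([[-2.0, 0.0], [0.0, 3.0]]))  # saddle
print(classify([[1.0, 0.0], [0.0, 0.0]]))   # minimum or saddle
```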
Precision of Solution
In some applications, PROC NLP may produce parameter estimates that are not precise enough. Usually this means that the procedure terminated too early at a point too far from the optimal point. The termination criteria define the size of the termination region around the optimal point. Any point inside this region can be accepted for terminating the optimization process. The default values of the termination criteria are set to satisfy a reasonable compromise between the computational effort (computer time) and the precision of the computed estimates for the most common applications. However, there are a number of circumstances where the default values of the termination criteria specify a region that is either too large or too small. If the termination region is too large, it can contain points with low precision. In such cases, you should inspect the log or list output to find the message stating which termination criterion terminated the optimization process. In many applications, you can obtain a solution with higher precision by simply using the old parameter estimates as starting values in a subsequent run where you specify a smaller value for the termination criterion that was satisfied at the previous run.
If the termination region is too small, the optimization process may take longer to find a point inside such a region, or may not find such a point at all due to rounding errors in function values and derivatives. This can easily happen in applications where finite-difference approximations of derivatives are used and the GCONV and ABSGCONV termination criteria are too small to respect rounding errors in the gradient values.
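For instance, a run that stopped on the GCONV criterion can be restarted from its own estimates with a smaller criterion value (the model and data set names below are hypothetical):

```sas
/* First run with default termination criteria */
proc nlp tech=trureg outest=est1;
   min y;
   parms x1 x2 = 0;
   y = (x1 - 2)**2 + (x2 + 1)**4;
run;

/* Restart from the previous estimates with a tighter GCONV= value */
proc nlp tech=trureg inest=est1 outest=est2 gconv=1e-10;
   min y;
   parms x1 x2;
   y = (x1 - 2)**2 + (x2 + 1)**4;
run;
```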
Covariance Matrix
The COV= option must be specified to compute an approximate covariance matrix for the parameter estimates under asymptotic theory for least squares, maximum-likelihood, or Bayesian estimation, with or without corrections for degrees of freedom as specified by the VARDEF= option.
Two groups of six different forms of covariance matrices (and therefore approximate standard errors) can be computed corresponding to the following two situations:
• The LSQ statement is specified, which means that least squares estimates are being computed:
  min f(x) = (1/2) Σ_{i=1}^{n} f_i²(x)
• The MIN or MAX statement is specified, which means that maximum-likelihood or Bayesian estimates are being computed:
  opt f(x) = Σ_{i=1}^{n} f_i(x)
where opt is either min or max.
In either case, the following matrices are used:
  G = ∇² f(x)
  J(f) = (∇f_1, …, ∇f_n) = (∂f_i / ∂x_j)
  JJ(f) = J(f)^T J(f)
  V = J(f)^T diag(f_i²) J(f)
  W = J(f)^T diag(f_i†) J(f)
where
  f_i† = 0 if f_i = 0, and 1/f_i otherwise
For unconstrained minimization, or when none of the final parameter estimates are subjected to linear equality or active inequality constraints, the formulas of the six types of covariance matrices are as follows:

Table 6.3  Types of Covariance Matrices

COV=   MIN or MAX Statement             LSQ Statement
1 M    (_NOBS_/d) G⁻¹ JJ(f) G⁻¹         (_NOBS_/d) G⁻¹ V G⁻¹
2 H    (_NOBS_/d) G⁻¹                   σ² G⁻¹
3 J    (1/d) W⁻¹                        σ² JJ(f)⁻¹
4 B    (1/d) G⁻¹ W G⁻¹                  σ² G⁻¹ JJ(f) G⁻¹
5 E    (_NOBS_/d) JJ(f)⁻¹               (1/d) V⁻¹
6 U    (_NOBS_/d) W⁻¹ JJ(f) W⁻¹         (_NOBS_/d) JJ(f)⁻¹ V JJ(f)⁻¹
The value of d depends on the VARDEF= option and on the value of the _NOBS_ variable:
  d = max(1, _NOBS_ − _DF_)   for VARDEF=DF
  d = _NOBS_                  for VARDEF=N
where _DF_ is either set in the program statements or set by default to n (the number of parameters) and _NOBS_ is either set in the program statements or set by default to nobs × mfun, where nobs is the number of observations in the data set and mfun is the number of functions listed in the LSQ, MIN, or MAX statement.
The value of σ² depends on the specification of the SIGSQ= option and on the value of d:
  σ² = sq × _NOBS_ / d   if SIGSQ=sq is specified
  σ² = 2 f(x*) / d       if SIGSQ= is not specified
where f(x*) is the value of the objective function at the optimal parameter estimates x*.
The two groups of formulas distinguish between two situations:
• For least squares estimates, the error variance can be estimated from the objective function value and is used in three of the six different forms of covariance matrices. If you have an independent estimate of the error variance, you can specify it with the SIGSQ= option.
• For maximum-likelihood or Bayesian estimates, the objective function should be the logarithm of the likelihood or of the posterior density when using the MAX statement.
For minimization, the inversion of the matrices in these formulas is done so that negative eigenvalues are considered zero, resulting always in a positive semidefinite covariance matrix.
In small samples, estimates of the covariance matrix based on asymptotic theory are often too small and should be used with caution.
If the final parameter estimates are subjected to n_act > 0 linear equality or active linear inequality constraints, the formulas of the covariance matrices are modified similarly to Gallant (1987) and Cramer (1986, p. 38) and additionally generalized for applications with singular matrices. In the constrained case, the value of d used in the scalar factor σ² is defined by
  d = max(1, _NOBS_ − _DF_ − n_act)   for VARDEF=DF
  d = _NOBS_                          for VARDEF=N
where n_act is the number of active constraints and _NOBS_ is set as in the unconstrained case.
For minimization, the covariance matrix should be positive definite; for maximization it should be negative definite. There are several options available to check for a rank deficiency of the covariance matrix:
• The ASINGULAR=, MSINGULAR=, and VSINGULAR= options can be used to set three singularity criteria for the inversion of the matrix A needed to compute the covariance matrix, where A is either the Hessian or one of the crossproduct Jacobian matrices. The singularity criterion used for the inversion is
  |d_{j,j}| ≤ max(ASING, VSING × |A_{j,j}|, MSING × max(|A_{1,1}|, …, |A_{n,n}|))
where d_{j,j} is the diagonal pivot of the matrix A, and ASING, VSING, and MSING are the specified values of the ASINGULAR=, VSINGULAR=, and MSINGULAR= options. The default values are
  ASING: the square root of the smallest positive double precision value
  MSING: 1E−12 if the SINGULAR= option is not specified, and max(10ε, 1E−4 × SINGULAR) otherwise, where ε is the machine precision
  VSING: 1E−8 if the SINGULAR= option is not specified, and the value of SINGULAR otherwise
NOTE: In many cases, a normalized matrix D⁻¹AD⁻¹ is decomposed, and the singularity criteria are modified correspondingly.
• If the matrix A is found singular in the first step, a generalized inverse is computed. Depending on the G4= option, a generalized inverse is computed that satisfies either all four or only two Moore-Penrose conditions. If the number of parameters n of the application is less than or equal to G4=i, a G4 inverse is computed; otherwise only a G2 inverse is computed. The G4 inverse is computed by (the computationally very expensive but numerically stable) eigenvalue decomposition; the G2 inverse is computed by Gauss transformation. The G4 inverse is computed using the eigenvalue decomposition A = ZΛZ^T, where Z is the orthogonal matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues, Λ = diag(λ_1, …, λ_n). If the PEIGVAL option is specified, the eigenvalues λ_i are displayed. The G4 inverse of A is set to
  A⁻ = ZΛ⁻Z^T
where the diagonal matrix Λ⁻ = diag(λ_1⁻, …, λ_n⁻) is defined using the COVSING= option:
  λ_i⁻ = 1/λ_i   if |λ_i| > COVSING
  λ_i⁻ = 0       if |λ_i| ≤ COVSING
If the COVSING= option is not specified, the nr smallest eigenvalues are set to zero, where nr is the number of rank deficiencies found in the first step.
For optimization techniques that do not use second-order derivatives, the covariance matrix is usually computed using finite-difference approximations of the derivatives. By specifying TECH=NONE, any of the covariance matrices can be computed using analytical derivatives. The covariance matrix specified by the COV= option can be displayed (using the PCOV option) and is written to the OUTEST= data set.
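A minimal sketch of requesting one of these matrices (the model and option values below are illustrative, not from this chapter):

```sas
/* Least squares fit of a tiny hypothetical model; COV=2 selects the
   sigma^2 G^{-1} form for LSQ, PCOV prints the matrix, and the matrix
   is also written to the OUTEST= data set as _TYPE_=COV2 observations */
proc nlp tech=levmar cov=2 pcov vardef=df outest=covout;
   lsq r1 r2 r3;
   parms b0 b1 = 1;
   r1 = 1.1 - (b0 + b1);
   r2 = 1.9 - (b0 + 2*b1);
   r3 = 3.2 - (b0 + 3*b1);
run;
```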
Input and Output Data Sets
DATA= Input Data Set
The DATA= data set is used only to specify an objective function that is a combination of m other functions f_i. For each function f_i, i = 1, …, m, listed in a MAX, MIN, or LSQ statement, each observation l, l = 1, …, nobs, in the DATA= data set defines a specific function f_{il} that is evaluated by substituting the values of the variables of this observation into the program statements. If the MAX or MIN statement is used, the m × nobs specific functions f_{il} are added to a single objective function f. If the LSQ statement is used, the sum of squares of the m × nobs specific functions f_{il} is minimized. The NOMISS option causes observations with missing values to be skipped.
INEST= Input Data Set
The INEST= (or INVAR=, or ESTDATA=) input data set can be used to specify the initial values of the parameters defined in a DECVAR statement, as well as boundary constraints and the more general linear constraints that can be imposed on these parameters. This form of input is similar to the dense format input used in PROC LP.
The variables of the INEST= data set are
• a character variable _TYPE_ that indicates the type of the observation
• n numeric variables with the parameter names used in the DECVAR statement
• the BY variables that are used in a DATA= input data set
• a numeric variable _RHS_ specifying the right-hand-side constants (needed only if linear constraints are used)
• additional variables with names corresponding to constants used in the program statements
The content of the _TYPE_ variable defines the meaning of the observation of the INEST= data set. PROC NLP recognizes the following _TYPE_ values:
• PARMS, which specifies initial values for parameters. Additional variables can contain the values of constants that are referred to in program statements. The values of the constants in the PARMS observation initialize the constants in the program statements.
• UPPERBD | UB, which specifies upper bounds. A missing value indicates that no upper bound is specified for the parameter.
• LOWERBD | LB, which specifies lower bounds. A missing value indicates that no lower bound is specified for the parameter.
• LE | <= | <, which specifies the linear constraint Σ_j a_{ij} x_j ≤ b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
• GE | >= | >, which specifies the linear constraint Σ_j a_{ij} x_j ≥ b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
• EQ | =, which specifies the linear constraint Σ_j a_{ij} x_j = b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
The constraints specified in an INEST= data set are added to the constraints specified in the BOUNDS and LINCON statements. You can use an OUTEST= data set as an INEST= data set in a subsequent run of PROC NLP. However, be aware that the OUTEST= data set also contains the boundary and general linear constraints specified in the previous run of PROC NLP. When you use this OUTEST= data set without changes as an INEST= data set, PROC NLP adds the constraints from the data set to the constraints specified by a BOUNDS or LINCON statement. Although PROC NLP automatically eliminates multiple identical constraints, you should avoid specifying the same constraint twice.
INQUAD= Input Data Set
Two types of INQUAD= data sets can be used to specify the objective function of a quadratic programming problem for TECH=QUADAS or TECH=LICOMP,
  f(x) = (1/2) x^T G x + g^T x + c,  with G^T = G
The dense INQUAD= data set must contain all numerical values of the symmetric matrix G, the vector g, and the scalar c. Using the sparse INQUAD= data set allows you to specify only the nonzero positions in matrix G and vector g. Those locations that are not set by the sparse INQUAD= data set are assumed to be zero.
Dense INQUAD= Data Set
A dense INQUAD= data set must contain two character variables, _TYPE_ and _NAME_, and at
least n numeric variables whose names are the parameter names. The _TYPE_ variable takes the
following values:
• QUAD lists the n values of the row of the G matrix that is defined by the parameter name used in the _NAME_ variable.
• LINEAR lists the n values of the g vector.
• CONST sets the value of the scalar c and cannot contain different numerical values; however, it can contain up to n − 1 missing values.
• PARMS specifies initial values for parameters.
• UPPERBD | UB specifies upper bounds. A missing value indicates that no upper bound is specified.
• LOWERBD | LB specifies lower bounds. A missing value indicates that no lower bound is specified.
• LE | <= | < specifies the linear constraint Σ_j a_{ij} x_j ≤ b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
• GE | >= | > specifies the linear constraint Σ_j a_{ij} x_j ≥ b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
• EQ | = specifies the linear constraint Σ_j a_{ij} x_j = b_i. The n parameter values contain the coefficients a_{ij}, and the _RHS_ variable contains the right-hand side b_i. Missing values indicate zeros.
Constraints specied in a dense INQUAD= data set are added to the constraints specied in BOUNDS
and LINCON statements.
Sparse INQUAD= Data Set
A sparse INQUAD= data set must contain three character variables, _TYPE_, _ROW_, and _COL_, and one numeric variable _VALUE_. The _TYPE_ variable can assume two values:
• QUAD specifies that the _ROW_ and _COL_ variables define the row and column locations of the values in the G matrix.
• LINEAR specifies that the _ROW_ variable defines the row locations of the values in the g vector. The _COL_ variable is not used.
Using both the MODEL= option and the INCLUDE statement with the same model file includes the file twice (erroneous in most cases).
OUT= Output Data Set
The OUT= data set contains those variables of a DATA= input data set that are referred to in the program statements, together with additional variables computed by the program statements for the objective function. Specifying the NOMISS option enables you to skip observations with missing values in variables used in the program statements. The OUT= data set can also contain first- and second-order derivatives of these variables if the OUTDER= option is specified. The variables and derivatives are evaluated at the final parameter estimates x* or (for TECH=NONE) at the initial value x0.
The variables of the OUT= data set are
• the BY variables and all other variables that are used in a DATA= input data set and referred to in the program code
• a variable _OBS_ containing the number of observations read from a DATA= input data set, where the counting is restarted with the start of each BY group. If there is no DATA= input data set, then _OBS_=1.
• a character variable _TYPE_ naming the type of the observation
• the parameter variables listed in the DECVAR statement
• the function variables listed in the MIN, MAX, or LSQ statement
• all other variables computed in the program statements
• the character variable _WRT_ (if OUTDER=1) containing the "with respect to" variable for which the first-order derivatives are written in the function variables
• the two character variables _WRT1_ and _WRT2_ (if OUTDER=2) containing the two "with respect to" variables for which the first- and second-order derivatives are written in the function variables
OUTEST= Output Data Set
The OUTEST= or OUTVAR= output data set saves the optimization solution of PROC NLP. You can use the OUTEST= or OUTVAR= data set as follows:
• to save the values of the objective function on grid points for examining, for example, surface plots using PROC G3D (use the OUTGRID option)
• to avoid any costly computation of analytical (first- or second-order) derivatives during optimization when they are needed only upon termination. In this case a two-step approach is recommended:
  1. In a first execution, the optimization is done; that is, optimal parameter estimates are computed, and the results are saved in an OUTEST= data set.
  2. In a subsequent execution, the optimal parameter estimates in the previous OUTEST= data set are read in an INEST= data set and used with TECH=NONE to compute further results, such as analytical second-order derivatives or some kind of covariance matrix.
• to restart the procedure using parameter estimates as initial values
• to split a time-consuming optimization problem into a series of smaller problems using intermediate results as initial values in subsequent runs. (Refer to the MAXTIME=, MAXIT=, and MAXFUNC= options to trigger stopping.)
• to write the value of the objective function, the parameter estimates, the time in seconds since the beginning of the optimization process, and (if available) the gradient to the OUTEST= data set during the iterations. After the PROC NLP run is completed, the convergence progress can be inspected by graphically displaying the iterative information. (Refer to the OUTITER option.)
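The iteration-history use might be sketched as follows (the model is hypothetical, and the _TYPE_/_ITER_ filtering assumes the OUTEST= variables described in this section):

```sas
/* Write one observation per iteration to the OUTEST= data set */
proc nlp tech=congra outest=iterhist outiter;
   min y;
   parms x1 x2 = -1;
   y = 100*(x2 - x1*x1)**2 + (1 - x1)**2;
run;

/* Display the objective value (_RHS_) against the iteration number */
proc sgplot data=iterhist(where=(_iter_ ne . and _type_='PARMS'));
   series x=_iter_ y=_rhs_;
run;
```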
The variables of the OUTEST= data set are
• the BY variables that are used in a DATA= input data set
• a character variable _TECH_ naming the optimization technique used
• a character variable _TYPE_ specifying the type of the observation
• a character variable _NAME_ naming the observation. For a linear constraint, the _NAME_ variable indicates whether the constraint is active at the solution. For the initial observations, the _NAME_ variable indicates if the number in the _RHS_ variable corresponds to the number of positive, negative, or zero eigenvalues.
• n numeric variables with the parameter names used in the DECVAR statement. These variables contain a point x of the parameter space, lower or upper bound constraints, or the coefficients of linear constraints.
• a numeric variable _RHS_ (right-hand side) that is used for the right-hand-side value b_i of a linear constraint or for the value f = f(x) of the objective function at a point x of the parameter space
• a numeric variable _ITER_ that is zero for initial values, equal to the iteration number for the OUTITER output, and missing for the result output
The _TYPE_ variable identifies how to interpret the observation. If _TYPE_ is
• PARMS, then parameter-named variables contain the coordinates of the resulting point x*. The _RHS_ variable contains f(x*).
• INITIAL, then parameter-named variables contain the feasible starting point x^(0). The _RHS_ variable contains f(x^(0)).
• GRIDPNT, then (if the OUTGRID option is specified) parameter-named variables contain the coordinates of any point x^(k) used in the grid search. The _RHS_ variable contains f(x^(k)).
• GRAD, then parameter-named variables contain the gradient at the initial or final estimates.
• STDERR, then parameter-named variables contain the approximate standard errors (square roots of the diagonal elements of the covariance matrix) if the COV= option is specified.
• _NOBS_, then (if the COV= option is specified) all parameter variables contain the value of _NOBS_ used in computing the σ² value in the formula of the covariance matrix.
• UPPERBD | UB, then (if there are boundary constraints) the parameter variables contain the upper bounds.
• LOWERBD | LB, then (if there are boundary constraints) the parameter variables contain the lower bounds.
• NACTBC, then all parameter variables contain the number n_abc of active boundary constraints at the solution x*.
• ACTBC, then (if there are active boundary constraints) the observations indicate which parameters are actively constrained, as follows:
  _NAME_=GE  the active lower bounds
  _NAME_=LE  the active upper bounds
  _NAME_=EQ  the active equality constraints
• NACTLC, then all parameter variables contain the number n_alc of active linear constraints that are recognized as linearly independent.
• NLDACTLC, then all parameter variables contain the number of active linear constraints that are recognized as linearly dependent.
• LE, then (if there are linear constraints) the observation contains the ith linear constraint Σ_j a_{ij} x_j ≤ b_i. The parameter variables contain the coefficients a_{ij}, j = 1, …, n, and the _RHS_ variable contains b_i. If the constraint i is active at the solution x*, then _NAME_=ACTLC or _NAME_=LDACTLC.
• GE, then (if there are linear constraints) the observation contains the ith linear constraint Σ_j a_{ij} x_j ≥ b_i. The parameter variables contain the coefficients a_{ij}, j = 1, …, n, and the _RHS_ variable contains b_i. If the constraint i is active at the solution x*, then _NAME_=ACTLC or _NAME_=LDACTLC.
• EQ, then (if there are linear constraints) the observation contains the ith linear constraint Σ_j a_{ij} x_j = b_i. The parameter variables contain the coefficients a_{ij}, j = 1, …, n, the _RHS_ variable contains b_i, and _NAME_=ACTLC or _NAME_=LDACTLC.
• LAGRANGE, then (if at least one of the linear constraints is an equality constraint or an active inequality constraint) the observation contains the vector of Lagrange multipliers. The Lagrange multipliers of active boundary constraints are listed first, followed by those of active linear constraints and those of active nonlinear constraints. Lagrange multipliers are available only for the set of linearly independent active constraints.
• PROJGRAD, then (if there are linear constraints) the observation contains the n − n_act values of the projected gradient g_Z = Z^T g in the variables corresponding to the first n − n_act parameters.
• JACOBIAN, then (if the PJACOBI or OUTJAC option is specified) the m observations contain the m rows of the m × n Jacobian matrix. The _RHS_ variable contains the row number l, l = 1, …, m.
• HESSIAN, then the first n observations contain the n rows of the (symmetric) Hessian matrix. The _RHS_ variable contains the row number j = 1, …, n, and the _NAME_ variable contains the corresponding parameter name.
• PROJHESS, then the first n − n_act observations contain the n − n_act rows of the projected Hessian matrix Z^T G Z. The _RHS_ variable contains the row number j = 1, …, n − n_act, and the _NAME_ variable is blank.
• CRPJAC, then the first n observations contain the n rows of the (symmetric) crossproduct Jacobian matrix at the solution. The _RHS_ variable contains the row number j = 1, …, n, and the _NAME_ variable contains the corresponding parameter name.
• PROJCRPJ, then the first n − n_act observations contain the n − n_act rows of the projected crossproduct Jacobian matrix Z^T (J^T J) Z. The _RHS_ variable contains the row number j = 1, …, n − n_act, and the _NAME_ variable is blank.
• COV1, COV2, COV3, COV4, COV5, or COV6, then (depending on the COV= option) the first n observations contain the n rows of the (symmetric) covariance matrix of the parameter estimates. The _RHS_ variable contains the row number j = 1, …, n, and the _NAME_ variable contains the corresponding parameter name.
• DETERMIN contains the determinant det = a × 10^b of the matrix specified by the value of the _NAME_ variable, where a is the value of the first variable in the DECVAR statement and b is in _RHS_.
• NEIGPOS, NEIGNEG, or NEIGZER, then the _RHS_ variable contains the number of positive, negative, or zero eigenvalues of the matrix specified by the value of the _NAME_ variable.
• COVRANK, then the _RHS_ variable contains the rank of the covariance matrix.
• SIGSQ, then the _RHS_ variable contains the scalar factor of the covariance matrix.
• _TIME_, then (if the OUTITER option is specified) the _RHS_ variable contains the number of seconds passed since the start of the optimization.
• TERMINAT, then, if the optimization terminated at a point satisfying one of the termination criteria, an abbreviation of the corresponding criterion is given in the _NAME_ variable. Otherwise _NAME_=PROBLEMS.
If for some reason the procedure does not terminate successfully (for example, if no feasible initial values can be computed or the function value or derivatives at the starting point cannot be computed), the OUTEST= data set may contain only part of the observations (usually only the PARMS and GRAD observations).
NOTE: Generally you can use an OUTEST= data set as an INEST= data set in a further run of PROC NLP. However, be aware that the OUTEST= data set also contains the boundary and general linear constraints specified in the previous run of PROC NLP. When you use this OUTEST= data set without changes as an INEST= data set, PROC NLP adds the constraints from the data set to the constraints specified by a BOUNDS or LINCON statement. Although PROC NLP automatically eliminates multiple identical constraints, you should avoid specifying the same constraint twice.
Output of Profiles
The following observations are written to the OUTEST= data set only when the PROFILE statement or CLPARM option is specified.

Table 6.4  Output of Profiles

_TYPE_    _NAME_    _RHS_    Meaning of Observation
PLC_LOW   parname   y value  coordinates of lower CL for α
PLC_UPP   parname   y value  coordinates of upper CL for α
WALD_CL   LOWER     y value  lower Wald CL for α in _ALPHA_
WALD_CL   UPPER     y value  upper Wald CL for α in _ALPHA_
PL_CL     LOWER     y value  lower PL CL for α in _ALPHA_
PL_CL     UPPER     y value  upper PL CL for α in _ALPHA_
PROFILE   L(THETA)  missing  y value corresponding to x value in following _NAME_=THETA
PROFILE   THETA     missing  x value corresponding to y value in previous _NAME_=L(THETA)
Assume that the PROFILE statement specifies n_p parameters and n_α values of α.
• _TYPE_=PLC_LOW and _TYPE_=PLC_UPP:
If the CLPARM= option or the PROFILE statement with the OUTTABLE option is specified, then the complete set θ of parameter estimates (rather than only the confidence limit x = θ_j) is written to the OUTEST= data set for each side of the confidence interval. This output may be helpful for further analyses on how small changes in x = θ_j affect the changes in the other θ_i, i ≠ j. The _ALPHA_ variable contains the corresponding value of α. There should be no more than 2 n_α n_p observations. If the confidence limit cannot be computed, the corresponding observation is not available.
• _TYPE_=WALD_CL:
If CLPARM=WALD, CLPARM=BOTH, or the PROFILE statement with α values is specified, then the Wald confidence limits are written to the OUTEST= data set for each of the default or specified values of α. The _ALPHA_ variable contains the corresponding value of α. There should be 2 n_α observations.
• _TYPE_=PL_CL:
If CLPARM=PL, CLPARM=BOTH, or the PROFILE statement with α values is specified, then the PL confidence limits are written to the OUTEST= data set for each of the default or specified values of α. The _ALPHA_ variable contains the corresponding values of α. There should be 2 n_α observations.
  f(x) = (1/2) Σ_{k=1}^{15} f_k²(x),  x = (x1, x2, x3)
where
  f_k(x) = y_k − (x1 + u_k / (v_k x2 + w_k x3))
with u_k = k, v_k = 16 − k, w_k = min(u_k, v_k), and
  y = (.14, .18, .22, .25, .29, .32, .35, .39, .37, .58, .73, .96, 1.34, 2.10, 4.39)
The minimum function value f(x*) = 4.107E−3 is at the point (0.08, 1.13, 2.34). The starting point x0 = (1, 1, 1) is used.
The following is the naive way of specifying the objective function.
proc nlp tech=levmar;
   lsq y1-y15;
   parms x1-x3 = 1;
   tmp1 = 15 * x2 + min(1,15) * x3;
   y1 = 0.14 - (x1 + 1 / tmp1);
   tmp1 = 14 * x2 + min(2,14) * x3;
   y2 = 0.18 - (x1 + 2 / tmp1);
   tmp1 = 13 * x2 + min(3,13) * x3;
   y3 = 0.22 - (x1 + 3 / tmp1);
   tmp1 = 12 * x2 + min(4,12) * x3;
   y4 = 0.25 - (x1 + 4 / tmp1);
   tmp1 = 11 * x2 + min(5,11) * x3;
   y5 = 0.29 - (x1 + 5 / tmp1);
   tmp1 = 10 * x2 + min(6,10) * x3;
   y6 = 0.32 - (x1 + 6 / tmp1);
   tmp1 = 9 * x2 + min(7,9) * x3;
   y7 = 0.35 - (x1 + 7 / tmp1);
   tmp1 = 8 * x2 + min(8,8) * x3;
   y8 = 0.39 - (x1 + 8 / tmp1);
   tmp1 = 7 * x2 + min(9,7) * x3;
   y9 = 0.37 - (x1 + 9 / tmp1);
   tmp1 = 6 * x2 + min(10,6) * x3;
   y10 = 0.58 - (x1 + 10 / tmp1);
   tmp1 = 5 * x2 + min(11,5) * x3;
   y11 = 0.73 - (x1 + 11 / tmp1);
   tmp1 = 4 * x2 + min(12,4) * x3;
   y12 = 0.96 - (x1 + 12 / tmp1);
   tmp1 = 3 * x2 + min(13,3) * x3;
   y13 = 1.34 - (x1 + 13 / tmp1);
   tmp1 = 2 * x2 + min(14,2) * x3;
   y14 = 2.10 - (x1 + 14 / tmp1);
   tmp1 = 1 * x2 + min(15,1) * x3;
   y15 = 4.39 - (x1 + 15 / tmp1);
run;
A more economical way to program this problem uses the DATA= option to input the 15 terms in f(x).
data bard;
   input r @@;
   w1 = 16. - _n_;
   w2 = min(_n_ , 16. - _n_);
datalines;
.14 .18 .22 .25 .29 .32 .35 .39
.37 .58 .73 .96 1.34 2.10 4.39
;

proc nlp data=bard tech=levmar;
   lsq y;
   parms x1-x3 = 1.;
   y = r - (x1 + _obs_ / (w1 * x2 + w2 * x3));
run;
Another way you can specify the objective function uses the ARRAY statement and an explicit DO loop, as in the following code.
proc nlp tech=levmar;
   array r[15] .14 .18 .22 .25 .29 .32 .35 .39 .37 .58
               .73 .96 1.34 2.10 4.39 ;
   array y[15] y1-y15;
   lsq y1-y15;
   parms x1-x3 = 1.;
   do i = 1 to 15;
      w1 = 16. - i;
      w2 = min(i , w1);
      w3 = w1 * x2 + w2 * x3;
      y[i] = r[i] - (x1 + i / w3);
   end;
run;
Example 6.2: Using the INQUAD= Option
This example illustrates the INQUAD= option for specifying a quadratic programming problem:
  min f(x) = (1/2) x^T G x + g^T x + c,  with G^T = G
Suppose that c = −100, G = diag(0.4, 4), and 2 ≤ x1 ≤ 50, −50 ≤ x2 ≤ 50, and 10 ≤ 10 x1 − x2.
You specify the constant c and the Hessian G in the data set QUAD1. Notice that the _TYPE_
variable contains the keywords that identify how the procedure should interpret the observations.
data quad1;
input _type_ $ _name_ $ x1 x2;
datalines;
const . -100 -100
quad x1 0.4 0
quad x2 0 4
;
You specify the QUAD1 data set with the INQUAD= option. Notice that the names of the variables
in the QUAD1 data set and the _NAME_ variable match the names of the parameters in the PARMS
statement.
proc nlp inquad=quad1 all;
   min ;
   parms x1 x2 = -1;
   bounds 2 <= x1 <= 50,
        -50 <= x2 <= 50;
   lincon 10 <= 10 * x1 - x2;
run;
Alternatively, you can use a sparse format for specifying the G matrix, eliminating the zeros. You
use the special variables _ROW_, _COL_, and _VALUE_ to give the nonzero row and column names
and value.
data quad2;
input _type_ $ _row_ $ _col_ $ _value_;
datalines;
const . . -100
quad x1 x1 0.4
quad x2 x2 4
;
You can also include the constraints in the QUAD data set. Notice how the _TYPE_ variable contains
keywords that identify how the procedure is to interpret the values in each observation.
data quad3;
   input _type_ $ _name_ $ x1 x2 _rhs_;
   datalines;
const    .   -100  -100  .
quad     x1  0.02  0     .
quad     x2  0.00  2     .
parms    .   -1    -1    .
lowerbd  .   2     -50   .
upperbd  .   50    50    .
ge       .   10    -1    10
;
proc nlp inquad=quad3;
min ;
parms x1 x2;
run;
Example 6.3: Using the INEST=Option
This example illustrates the use of the INEST= option for specifying a starting point and linear
constraints. You name a data set with the INEST= option. The format of this data set is similar to the
format of the QUAD data set described in the previous example.
Consider the Hock and Schittkowski (1981) Problem #24:
  min f(x) = ((x1 − 3)² − 9) x2³ / (27 √3)
subject to:
  0 ≤ x1, x2
  0 ≤ .57735 x1 − x2
  0 ≤ x1 + 1.732 x2
  x1 + 1.732 x2 ≤ 6
with minimum function value f(x*) = −1 at x* = (3, √3). The feasible starting point is x0 = (1, .5).
You can specify this model in PROC NLP as follows:
proc nlp tech=trureg outest=res;
   min y;
   parms x1 = 1,
         x2 = .5;
   bounds 0 <= x1-x2;
   lincon .57735 * x1 - x2 >= 0,
          x1 + 1.732 * x2 >= 0,
          -x1 - 1.732 * x2 >= -6;
   y = (((x1 - 3)**2 - 9.) * x2**3) / (27 * sqrt(3));
run;
Note that none of the data for this model are in a data set. Alternatively, you can save the starting point (1, .5) and the linear constraints in a data set. Notice that the _TYPE_ variable contains keywords that identify how the procedure is to interpret each of the observations and that the parameters in the problem, X1 and X2, are variables in the data set. The observation with _TYPE_=LOWERBD gives the lower bounds on the parameters. The observation with _TYPE_=GE gives the coefficients for the first constraint. Similarly, the subsequent observations contain specifications for the other constraints. Also notice that the special variable _RHS_ contains the right-hand-side values.
data betts1(type=est);
input _type_ $ x1 x2 _rhs_;
datalines;
parms 1 .5 .
lowerbd 0 0 .
ge .57735 -1 .
ge 1 1.732 .
le 1 1.732 6
;
Now you can solve this problem with the following code. Notice that you specify the objective
function and the parameters.
proc nlp inest=betts1 tech=trureg;
min y;
parms x1 x2;
y = (((x1 - 3)
**
2 - 9)
*
x2
**
3) / (27
*
sqrt(3));
run;
You can even include any constants used in the program statements in the INEST= data set. In the
following code the variables A, B, C, and D contain some of the constants used in calculating the
objective function Y.
data betts2(type=est);
input _type_ $ x1 x2 _rhs_ a b c d;
datalines;
parms 1 .5 . 3 9 27 3
lowerbd 0 0 . . . . .
ge .57735 -1 0 . . . .
ge 1 1.732 0 . . . .
le 1 1.732 6 . . . .
;
Notice that in the program statement for calculating Y, the constants are replaced by the A, B, C, and
D variables.
proc nlp inest=betts2 tech=trureg;
   min y;
   parms x1 x2;
   y = (((x1 - a)**2 - b) * x2**3) / (c * sqrt(d));
run;
Example 6.4: Restarting an Optimization
This example shows how you can restart an optimization problem using the OUTEST=, INEST=,
OUTMODEL=, and MODEL= options and how to save output into an OUT= data set. The least
squares solution of the Rosenbrock function using the trust region method is used.
The following code solves the problem and saves the model in the MODEL data set and the solution
in the EST and OUT1 data sets.
proc nlp tech=trureg outmodel=model outest=est out=out1;
   lsq y1 y2;
   parms x1 = -1.2,
         x2 = 1.;
   y1 = 10. * (x2 - x1 * x1);
   y2 = 1. - x1;
run;
proc print data=out1;
run;
The final parameter estimates x* = (1, 1) and the values of the functions f1 = Y1 and f2 = Y2 are
written into an OUT= data set, shown in Output 6.4.1. Since OUTDER=0 is the default, the OUT=
data set does not contain the Jacobian matrix.
Output 6.4.1 Solution in an OUT= Data Set
Obs _OBS_ _TYPE_ y1 y2 x2 x1
1 1 0 -2.2204E-16 1 1
Next, the procedure reads the optimal parameter estimates from the EST data set and the model from
the MODEL data set. It does not do any optimization (TECH=NONE), but it saves the Jacobian
matrix to the OUT=OUT2 data set because of the option OUTDER=1. It also displays the Jacobian
matrix because of the option PJAC; the Jacobian matrix is shown in Output 6.4.2. Output 6.4.3 shows
the contents of the OUT2 data set, which also contains the Jacobian matrix.
proc nlp tech=none model=model inest=est out=out2 outder=1 pjac PHISTORY;
lsq y1 y2;
parms x1 x2;
run;
proc print data=out2;
run;
Output 6.4.2 Jacobian Matrix Output
PROC NLP: Least Squares Minimization
Jacobian Matrix
x1 x2
-20 10
-1 0
Output 6.4.3 Jacobian Matrix in an OUT= Data Set
Obs _OBS_ _TYPE_ y1 y2 _WRT_ x2 x1
1 1 0 -0 1 1
2 1 ANALYTIC 10 0 x2 1 1
3 1 ANALYTIC -20 -1 x1 1 1
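The analytic Jacobian entries in Output 6.4.2 follow directly from differentiating the two residual functions; as a cross-check (a Python sketch, not part of the SAS documentation):

```python
def residuals(x1, x2):
    # Least squares residuals of the Rosenbrock problem, as in the LSQ statement
    y1 = 10.0 * (x2 - x1 * x1)
    y2 = 1.0 - x1
    return y1, y2

def jacobian(x1, x2):
    # Analytic partial derivatives d(y1, y2)/d(x1, x2)
    return [[-20.0 * x1, 10.0],   # dy1/dx1, dy1/dx2
            [-1.0,       0.0]]    # dy2/dx1, dy2/dx2

# At the optimum (1, 1) both residuals vanish ...
print(residuals(1.0, 1.0))
# ... and the Jacobian matches Output 6.4.2: [[-20, 10], [-1, 0]]
print(jacobian(1.0, 1.0))
```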
Example 6.5: Approximate Standard Errors
The NLP procedure provides a variety of ways for estimating parameters in nonlinear statistical
models and for obtaining approximate standard errors and covariance matrices for the estimators.
These methods are illustrated by estimating the mean of a random sample from a normal distribution
with mean μ and standard deviation σ. The simplicity of the example makes it easy to compare the
results of different methods in NLP with the usual estimator, the sample mean.
The following data step is used:
data x;
input x @@;
datalines;
1 3 4 5 7
;
The standard error of the mean, computed with n − 1 degrees of freedom, is 1. The usual maximum-
likelihood approximation to the standard error of the mean, using a variance divisor of n rather than
n − 1, is 0.894427.
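These two standard errors are easy to reproduce by hand; the following Python sketch (not part of the SAS documentation) computes both divisors for the sample 1, 3, 4, 5, 7:

```python
# Cross-check of the quoted standard errors for the sample 1, 3, 4, 5, 7
data = [1, 3, 4, 5, 7]
n = len(data)
mean = sum(data) / n                     # sample mean 4.0

ss = sum((x - mean)**2 for x in data)    # corrected sum of squares, 20.0
se_unbiased = (ss / (n - 1) / n)**0.5    # divisor n-1: standard error 1.0
se_ml = (ss / n / n)**0.5                # divisor n:   0.894427...

print(mean, se_unbiased, se_ml)
```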
The sample mean is a least squares estimator, so it can be computed using an LSQ statement.
Moreover, since this model is linear, the Hessian matrix and crossproduct Jacobian matrix are
identical, and all three versions of the COV= option yield the same variance and standard error of
the mean. Note that COV=J means that the crossproduct Jacobian is used. This is chosen because it
requires the least computation.
proc nlp data=x cov=j pstderr pshort PHISTORY;
lsq resid;
parms mean=0;
resid=x-mean;
run;
The results are the same as the usual estimates.
Output 6.5.1 Parameter Estimates
PROC NLP: Least Squares Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 1.000000 4.000000 0.016130
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 8.881784E-15
Value of Objective Function = 10
PROC NLP can also compute maximum-likelihood estimates of μ and σ. In this case it is convenient
to minimize the negative log likelihood. To get correct standard errors for maximum-likelihood
estimators, the SIGSQ=1 option is required. The following program specifies COV=1; the corresponding
outputs for COV=2 and COV=3 follow as well.
proc nlp data=x cov=1 sigsq=1 pstderr phes pcov pshort;
   min nloglik;
   parms mean=0, sigma=1;
   bounds 1e-12 < sigma;
   nloglik = .5 * ((x - mean) / sigma)**2 + log(sigma);
run;
The variance divisor is n instead of n − 1, so the standard error of the mean is 0.894427 instead of 1.
The standard error of the mean is the same with all six types of covariance matrix, but the standard
error of the standard deviation varies. The sampling distribution of the standard deviation depends
on the higher moments of the population distribution, so different methods of estimation can produce
markedly different estimates of the standard error of the standard deviation.
Output 6.5.2 shows the output when COV=1, Output 6.5.3 shows the output when COV=2, and
Output 6.5.4 shows the output when COV=3.
Output 6.5.2 Solution for COV=1
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.894427 4.472136 0.006566
2 sigma 2.000000 0.458258 4.364358 0.007260
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 1.331492E-10
2 sigma -5.606415E-9
Value of Objective Function = 5.9657359028
Hessian Matrix
mean sigma
mean 1.2500000028 -1.33149E-10
sigma -1.33149E-10 2.500000014
Determinant = 3.1250000245
Matrix has Only Positive Eigenvalues
Covariance Matrix 1: M = (NOBS/d) inv(G) JJ(f) inv(G)
mean sigma
mean 0.8 1.906775E-11
sigma 1.906775E-11 0.2099999991
Factor sigm = 1
Determinant = 0.1679999993
Matrix has Only Positive Eigenvalues
Output 6.5.3 Solution for COV=2
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.894427 4.472136 0.006566
2 sigma 2.000000 0.632456 3.162278 0.025031
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 1.331492E-10
2 sigma -5.606415E-9
Value of Objective Function = 5.9657359028
Hessian Matrix
mean sigma
mean 1.2500000028 -1.33149E-10
sigma -1.33149E-10 2.500000014
Determinant = 3.1250000245
Matrix has Only Positive Eigenvalues
Covariance Matrix 2: H = (NOBS/d) inv(G)
mean sigma
mean 0.7999999982 4.260769E-11
sigma 4.260769E-11 0.3999999978
Factor sigm = 1
Determinant = 0.3199999975
Matrix has Only Positive Eigenvalues
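The COV=2 standard errors just shown can be checked against the familiar normal-theory formulas SE(mean) = σ̂/√n and SE(σ̂) = σ̂/√(2n), where σ̂ is the maximum-likelihood estimate with divisor n (a Python sketch, not part of the SAS documentation):

```python
data = [1, 3, 4, 5, 7]
n = len(data)
mu_hat = sum(data) / n                                     # 4.0
sigma_hat = (sum((x - mu_hat)**2 for x in data) / n)**0.5  # 2.0 (divisor n)

# Normal-theory (inverse expected Hessian) standard errors, cf. the COV=2 output
se_mu = sigma_hat / n**0.5           # 0.894427...
se_sigma = sigma_hat / (2 * n)**0.5  # 0.632455...

print(mu_hat, sigma_hat, se_mu, se_sigma)
```

The COV=1 and COV=3 values for sigma differ because those estimators involve the outer product of the per-observation gradients, not only the Hessian.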
Output 6.5.4 Solution for COV=3
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.509136 7.856442 0.000537
2 sigma 2.000000 0.419936 4.762634 0.005048
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 1.338402E-10
2 sigma -5.940302E-9
Value of Objective Function = 5.9657359028
Hessian Matrix
mean sigma
mean 1.2500000028 -1.33149E-10
sigma -1.33149E-10 2.500000014
Determinant = 3.1250000245
Matrix has Only Positive Eigenvalues
Covariance Matrix 3: J = (1/d) inv(W)
mean sigma
mean 0.2592197879 1.091093E-11
sigma 1.091093E-11 0.1763460041
Factor sigm = 0.2
Determinant = 0.0457123738
Matrix has Only Positive Eigenvalues
Under normality, the maximum-likelihood estimators of μ and σ are independent, as indicated by the
diagonal Hessian matrix in the previous example. Hence, the maximum-likelihood estimate of μ can
be obtained by using any fixed value for σ, such as 1. However, if the fixed value of σ differs from
the actual maximum-likelihood estimate (in this case 2), the model is misspecified and the standard
errors obtained with COV=2 or COV=3 are incorrect. It is therefore necessary to use COV=1, which
yields consistent estimates of the standard errors under a variety of forms of misspecification of the
error distribution.
proc nlp data=x cov=1 sigsq=1 pstderr pcov pshort;
   min sqresid;
   parms mean=0;
   sqresid = .5 * (x - mean)**2;
run;
This formulation produces the same standard error of the mean, 0.894427 (see Output 6.5.5).
Output 6.5.5 Solution for Fixed σ and COV=1
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.894427 4.472136 0.006566
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 0
Value of Objective Function = 10
Covariance Matrix 1: M = (NOBS/d) inv(G) JJ(f) inv(G)
mean
mean 0.8
Factor sigm = 1
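The value 0.8 in this sandwich matrix can be reproduced by hand. With σ fixed at 1, the Hessian G of ∑ .5(x − mean)² is n, JJ(f) is the crossproduct of the per-observation gradients −(x − mean), and the printed factor sigm is 1 (a Python sketch, not part of the SAS documentation):

```python
data = [1, 3, 4, 5, 7]
mu_hat = 4.0

# Per-observation gradient of .5*(x - mu)**2 with respect to mu is -(x - mu)
grads = [-(x - mu_hat) for x in data]
G = float(len(data))               # Hessian: each observation contributes 1, so G = n = 5
JJ = sum(g * g for g in grads)     # crossproduct of gradients = 20

sandwich = (1 / G) * JJ * (1 / G)  # inv(G) JJ(f) inv(G) = 0.8
se_mu = sandwich**0.5              # 0.894427...
print(sandwich, se_mu)
```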
The maximum-likelihood formulation with fixed σ is actually a least squares problem. The objective
function, parameter estimates, and Hessian matrix are the same as those in the first example in
this section using the LSQ statement. However, the Jacobian matrix is different, each row being
multiplied by twice the residual. To treat this formulation as a least squares problem, the SIGSQ=1
option can be omitted. But since the Jacobian is not the same as in the formulation using the LSQ
statement, the COV=1 | M and COV=3 | J options, which use the Jacobian, do not yield correct
standard errors. The correct standard error is obtained with COV=2 | H, which uses only the Hessian
matrix:
proc nlp data=x cov=2 pstderr pcov pshort;
   min sqresid;
   parms mean=0;
   sqresid = .5 * (x - mean)**2;
run;
The results are the same as in the first example.
Output 6.5.6 Solution for Fixed σ and COV=2
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 mean 4.000000 0.500000 8.000000 0.001324
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 mean 0
Value of Objective Function = 10
Covariance Matrix 2: H = (NOBS/d) inv(G)
mean
mean 0.25
Factor sigm = 1.25
In summary, to obtain appropriate standard errors for least squares estimates, you can use the LSQ
statement with any of the COV= options, or you can use the MIN statement with COV=2. To obtain
appropriate standard errors for maximum-likelihood estimates, you can use the MIN statement with
the negative log likelihood or the MAX statement with the log likelihood, and in either case you can
use any of the COV= options provided that you specify SIGSQ=1. You can also use a log-likelihood
function with a misspecified scale parameter provided that you use SIGSQ=1 and COV=1. For
nonlinear models, all of these methods yield approximations based on asymptotic theory, and should
therefore be interpreted cautiously.
Example 6.6: Maximum Likelihood Weibull Estimation
Two-Parameter Weibull Estimation
The following data are taken from Lawless (1982, p. 193) and represent the number of days it took
rats painted with a carcinogen to develop carcinoma. The last two observations are censored data
from a group of 19 rats:
data pike;
input days cens @@;
datalines;
143 0 164 0 188 0 188 0
190 0 192 0 206 0 209 0
213 0 216 0 220 0 227 0
230 0 234 0 246 0 265 0
304 0 216 1 244 1
;
Suppose that you want to compute the maximum likelihood estimates of the scale
parameter σ (α in Lawless), the shape parameter c (β in Lawless), and the location parameter θ
(μ in Lawless). The observed likelihood function of the three-parameter Weibull distribution
(Lawless 1982, p. 191) is
   L(θ, σ, c) = (c^m / σ^m) ∏_{i in D} ((t_i - θ)/σ)^(c-1) exp( - ∑_{i=1}^{n} ((t_i - θ)/σ)^c )

where D denotes the set of uncensored observations and m is the number of uncensored observations,
and the log likelihood is

   l(θ, σ, c) = m log c - m c log σ + (c - 1) ∑_{i in D} log(t_i - θ) - ∑_{i=1}^{n} ((t_i - θ)/σ)^c
The log likelihood function can be evaluated only for σ > 0, c > 0, and θ < min_i t_i. In the
estimation process, you must enforce these conditions using lower and upper boundary constraints.
The three-parameter Weibull estimation can be numerically difficult, and it usually pays off to
provide good initial estimates. Therefore, you first estimate σ and c of the two-parameter Weibull
distribution for constant θ = 0. You then use the optimal parameters σ and c as starting values for
the three-parameter Weibull estimation.
Although the use of an INEST= data set is not really necessary for this simple example, it illustrates
how it is used to specify starting values and lower boundary constraints:
data par1(type=est);
keep _type_ sig c theta;
_type_='parms'; sig = .5;
c = .5; theta = 0; output;
_type_='lb'; sig = 1.0e-6;
c = 1.0e-6; theta = .; output;
run;
The following PROC NLP call specifies the maximization of the log likelihood function for the
two-parameter Weibull estimation for constant θ = 0:
proc nlp data=pike tech=tr inest=par1 outest=opar1
         outmodel=model cov=2 vardef=n pcov phes;
   max logf;
   parms sig c;
   profile sig c / alpha = .9 to .1 by -.1 .09 to .01 by -.01;
   x_th = days - theta;
   s = - (x_th / sig)**c;
   if cens=0 then s + log(c) - c * log(sig) + (c-1) * log(x_th);
   logf = s;
run;
After a few iterations you obtain the solution given in Output 6.6.1.
Output 6.6.1 Optimization Results
PROC NLP: Nonlinear Maximization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 sig 234.318611 9.645908 24.292021 9.050475E-16
2 c 6.083147 1.068229 5.694611 0.000017269
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 sig 1.3372183E-9
2 c -7.859277E-9
Value of Objective Function = -88.23273515
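As an independent cross-check (a Python sketch, not part of the SAS documentation), the censored Weibull log likelihood coded in the program statements can be evaluated at the reported estimates; it reproduces the objective value above:

```python
import math

# Carcinoma data from Lawless (1982): (days, censoring indicator)
pike = [(143,0),(164,0),(188,0),(188,0),(190,0),(192,0),(206,0),(209,0),
        (213,0),(216,0),(220,0),(227,0),(230,0),(234,0),(246,0),(265,0),
        (304,0),(216,1),(244,1)]

def logf(sig, c, theta=0.0):
    # Censored Weibull log likelihood, mirroring the PROC NLP program statements
    total = 0.0
    for t, cens in pike:
        x_th = t - theta
        s = -(x_th / sig)**c
        if cens == 0:
            s += math.log(c) - c * math.log(sig) + (c - 1) * math.log(x_th)
        total += s
    return total

# Compare with the reported objective value -88.23273515
print(logf(234.318611, 6.083147))
```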
Since the gradient has only small elements and the Hessian (shown in Output 6.6.2) is negative
definite (has only negative eigenvalues), the solution defines an isolated maximum point.
Output 6.6.2 Hessian Matrix at x*
PROC NLP: Nonlinear Maximization
Hessian Matrix
sig c
sig -0.011457556 0.0257527577
c 0.0257527577 -0.934221388
Determinant = 0.0100406894
Matrix has Only Negative Eigenvalues
The square roots of the diagonal elements of the approximate covariance matrix of parameter
estimates are the approximate standard errors (ASEs). The covariance matrix is given in Output 6.6.3.
Output 6.6.3 Covariance Matrix
PROC NLP: Nonlinear Maximization
Covariance Matrix 2:
H = (NOBS/d) inv(G)
sig c
sig 93.043549863 2.5648395794
c 2.5648395794 1.141112488
Factor sigm = 1
Determinant = 99.594754608
Matrix has 2 Positive Eigenvalue(s)
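Because this is a maximization, the COV=2 covariance matrix is (NOBS/d) times the inverse of the negated Hessian, and VARDEF=N makes the factor 1 here. For a 2×2 matrix the inverse can be checked by hand (a Python sketch, not part of the SAS documentation):

```python
# Hessian entries from Output 6.6.2
h11, h12, h22 = -0.011457556, 0.0257527577, -0.934221388

det = h11 * h22 - h12 * h12   # matches the printed determinant 0.0100406894
g11, g12, g22 = -h11, -h12, -h22   # negated Hessian G; det(G) = det(H) for 2x2

# inv(G) = (1/det) * [[g22, -g12], [-g12, g11]]
cov11 = g22 / det
cov12 = -g12 / det
cov22 = g11 / det

ase_sig = cov11**0.5   # about 9.6459, cf. Output 6.6.1
ase_c = cov22**0.5     # about 1.0682
print(cov11, cov12, cov22, ase_sig, ase_c)
```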
The confidence limits in Output 6.6.4 correspond to the values in the PROFILE statement.
Output 6.6.4 Confidence Limits
PROC NLP: Nonlinear Maximization
Wald and PL Confidence Limits
Profile Likelihood
N Parameter Estimate Alpha Confidence Limits
1 sig 234.318611 0.900000 233.111324 235.532695
1 sig . 0.800000 231.886549 236.772876
1 sig . 0.700000 230.623280 238.063824
1 sig . 0.600000 229.292797 239.436639
1 sig . 0.500000 227.855829 240.935290
1 sig . 0.400000 226.251597 242.629201
1 sig . 0.300000 224.372260 244.643392
1 sig . 0.200000 221.984557 247.278423
1 sig . 0.100000 218.390824 251.394102
1 sig . 0.090000 217.884162 251.987489
1 sig . 0.080000 217.326988 252.645278
1 sig . 0.070000 216.708814 253.383546
1 sig . 0.060000 216.008815 254.228034
1 sig . 0.050000 215.199301 255.215496
1 sig . 0.040000 214.230116 256.411041
1 sig . 0.030000 213.020874 257.935686
1 sig . 0.020000 211.369067 260.066128
1 sig . 0.010000 208.671091 263.687174
2 c 6.083147 0.900000 5.950029 6.217752
2 c . 0.800000 5.815559 6.355576
2 c . 0.700000 5.677909 6.499187
2 c . 0.600000 5.534275 6.651789
2 c . 0.500000 5.380952 6.817880
2 c . 0.400000 5.212344 7.004485
2 c . 0.300000 5.018784 7.225733
2 c . 0.200000 4.776379 7.506166
2 c . 0.100000 4.431310 7.931669
2 c . 0.090000 4.382687 7.991457
2 c . 0.080000 4.327815 8.056628
2 c . 0.070000 4.270773 8.129238
2 c . 0.060000 4.207130 8.211221
2 c . 0.050000 4.134675 8.306218
2 c . 0.040000 4.049531 8.418782
2 c . 0.030000 3.945037 8.559677
2 c . 0.020000 3.805759 8.749130
2 c . 0.010000 3.588814 9.056751
Output 6.6.4 continued
PROC NLP: Nonlinear Maximization
Wald and PL Confidence Limits
Wald Confidence Limits
233.106494 235.530729
231.874849 236.762374
230.601846 238.035377
229.260292 239.376931
227.812545 240.824678
226.200410 242.436813
224.321270 244.315953
221.956882 246.680341
218.452504 250.184719
217.964960 250.672263
217.431654 251.205569
216.841087 251.796136
216.176649 252.460574
215.412978 253.224245
214.508337 254.128885
213.386118 255.251105
211.878873 256.758350
209.472398 259.164825
5.948912 6.217382
5.812514 6.353780
5.671537 6.494757
5.522967 6.643327
5.362638 6.803656
5.184103 6.982191
4.975999 7.190295
4.714157 7.452137
4.326067 7.840227
4.272075 7.894220
4.213014 7.953280
4.147612 8.018682
4.074029 8.092265
3.989457 8.176837
3.889274 8.277021
3.764994 8.401300
3.598076 8.568219
3.331572 8.834722
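The Wald limits are computed as estimate ± z_{1−α/2} × ASE. As an illustration (a Python sketch, not part of the SAS documentation), the α = 0.05 row for SIG can be reproduced from the estimate and approximate standard error in Output 6.6.1:

```python
from statistics import NormalDist

est, ase = 234.318611, 9.645908          # sig estimate and ASE from Output 6.6.1
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal quantile, about 1.96

lower = est - z * ase
upper = est + z * ase
# Compare with the alpha = 0.05 Wald row: 215.412978, 253.224245
print(lower, upper)
```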
Three-Parameter Weibull Estimation
You now prepare for the three-parameter Weibull estimation by using PROC UNIVARIATE to obtain
the smallest data value for the upper boundary constraint for θ. For this small problem, you can do
this much more simply by just using a value slightly smaller than the minimum data value 143.
/* Calculate upper bound for theta parameter */
proc univariate data=pike noprint;
   var days;
   output out=stats n=nobs min=minx range=range;
run;

data stats;
   set stats;
   keep _type_ theta;
   /* 1. write parms observation */
   theta = minx - .1 * range;
   if theta < 0 then theta = 0;
   _type_ = 'parms';
   output;
   /* 2. write ub observation */
   theta = minx * (1 - 1e-4);
   _type_ = 'ub';
   output;
run;
The data set PAR2 specifies the starting values and the lower and upper bounds for the three-parameter
Weibull problem:
proc sort data=opar1;
by _type_;
run;
data par2(type=est);
merge opar1(drop=theta) stats;
by _type_;
keep _type_ sig c theta;
if _type_ in ('parms' 'lowerbd' 'ub');
run;
The following PROC NLP call uses the MODEL= input data set containing the log likelihood
function that was saved during the two-parameter Weibull estimation:
proc nlp data=pike tech=tr inest=par2 outest=opar2
model=model cov=2 vardef=n pcov phes;
max logf;
parms sig c theta;
profile sig c theta / alpha = .5 .1 .05 .01;
run;
After a few iterations, you obtain the solution given in Output 6.6.5.
Output 6.6.5 Optimization Results
PROC NLP: Nonlinear Maximization
Optimization Results
Parameter Estimates
Approx Approx
N Parameter Estimate Std Err t Value Pr > |t|
1 sig 108.382690 32.573321 3.327345 0.003540
2 c 2.711476 1.058757 2.560998 0.019108
3 theta 122.025982 28.692364 4.252908 0.000430
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Function
1 sig -3.132469E-9
2 c -0.000000487
3 theta -6.760135E-8
Value of Objective Function = -87.32424712
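The same kind of cross-check works for the three-parameter fit (a Python sketch, not part of the SAS documentation): evaluating the censored log likelihood at the estimates in Output 6.6.5 reproduces the objective value above.

```python
import math

# Carcinoma data from Lawless (1982): (days, censoring indicator)
pike = [(143,0),(164,0),(188,0),(188,0),(190,0),(192,0),(206,0),(209,0),
        (213,0),(216,0),(220,0),(227,0),(230,0),(234,0),(246,0),(265,0),
        (304,0),(216,1),(244,1)]

def logf(sig, c, theta):
    # Censored three-parameter Weibull log likelihood, as in the saved MODEL
    total = 0.0
    for t, cens in pike:
        x_th = t - theta
        s = -(x_th / sig)**c
        if cens == 0:
            s += math.log(c) - c * math.log(sig) + (c - 1) * math.log(x_th)
        total += s
    return total

# Compare with the reported objective value -87.32424712
print(logf(108.382690, 2.711476, 122.025982))
```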
From inspecting the first- and second-order derivatives at the optimal solution, you can verify that
you have obtained an isolated maximum point. The Hessian matrix is shown in Output 6.6.6.
Output 6.6.6 Hessian Matrix
PROC NLP: Nonlinear Maximization
Hessian Matrix
sig c theta
sig -0.010639963 0.045388887 -0.01003374
c 0.045388887 -4.07872453 -0.083028097
theta -0.01003374 -0.083028097 -0.014752243
Determinant = 0.0000502152
Matrix has Only Negative Eigenvalues
The square roots of the diagonal elements of the approximate covariance matrix of parameter
estimates are the approximate standard errors. The covariance matrix is given in Output 6.6.7.
Output 6.6.7 Covariance Matrix
PROC NLP: Nonlinear Maximization
Covariance Matrix 2: H = (NOBS/d) inv(G)
sig c theta
sig 1060.9795634 29.924959631 -890.0493649
c 29.924959631 1.1209334438 -26.66229938
theta -890.0493649 -26.66229938 823.21458961
Factor sigm = 1
Determinant = 19914.591664
Matrix has 3 Positive Eigenvalue(s)
The differences between the Wald and profile confidence limits for the parameter c are remarkable,
especially for the upper 95% and 99% limits, as shown in Output 6.6.8.
Output 6.6.8 Confidence Limits
PROC NLP: Nonlinear Maximization
Wald and PL Confidence Limits
Profile Likelihood
N Parameter Estimate Alpha Confidence Limits
1 sig 108.382732 0.500000 91.811562 141.564605
1 sig . 0.100000 76.502373 .
1 sig . 0.050000 72.215845 .
1 sig . 0.010000 64.262384 .
2 c 2.711477 0.500000 2.139297 3.704052
2 c . 0.100000 1.574162 9.250072
2 c . 0.050000 1.424853 19.516170
2 c . 0.010000 1.163096 19.540685
3 theta 122.025942 0.500000 91.027144 135.095454
3 theta . 0.100000 . 141.833769
3 theta . 0.050000 . 142.512603
3 theta . 0.010000 . 142.967407
Wald and PL Confidence Limits
Wald Confidence Limits
86.412310 130.353154
54.804263 161.961200
44.540049 172.225415
24.479224 192.286240
1.997355 3.425599
0.969973 4.452981
0.636347 4.786607
-0.015706 5.438660
102.673186 141.378698
74.831079 142.985700
65.789795 142.985700
48.119116 142.985700
Example 6.7: Simple Pooling Problem
The following optimization problem is discussed in Haverly (1978) and in Liebman et al. (1986,
pp. 127–128). Two liquid chemicals, X and Y, are produced by the pooling and blending of three
input liquid chemicals, A, B, and C. You know the sulfur impurity amounts of the input chemicals,
and you have to respect upper limits on the sulfur impurity amounts of the output chemicals. The
sulfur concentrations and the prices of the input and output chemicals are:
- Chemical A: Concentration = 3%, Price = $6
- Chemical B: Concentration = 1%, Price = $16
- Chemical C: Concentration = 2%, Price = $10
- Chemical X: Concentration <= 2.5%, Price = $9
- Chemical Y: Concentration <= 1.5%, Price = $15
The problem is complicated by the fact that the two input chemicals A and B are available only as a
mixture (they are either shipped together or stored together). Because the amounts of A and B are
unknown, the sulfur concentration of the mixture is also unknown.
[Figure: pooling network. Chemicals A (3% S, $6) and B (1% S, $16) feed a common pool; the pool
and chemical C (2% S, $10) are blended into X (<= 2.5% S, $9, X <= 100) and Y (<= 1.5% S, $15,
Y <= 200) through the flows pool_to_x, pool_to_y, c_to_x, and c_to_y.]
You know customers will buy no more than 100 units of X and 200 units of Y. The problem is
determining how to operate the pooling and blending of the chemicals to maximize the profit. The
objective function for the profit is

   profit = cost(x) amount(x) + cost(y) amount(y)
            - cost(a) amount(a) - cost(b) amount(b) - cost(c) amount(c)
There are three groups of constraints:

1. The first group of constraint functions is the mass balance restrictions illustrated by the graph.
These are four linear equality constraints:

   - amount(a) + amount(b) = pool_to_x + pool_to_y
   - pool_to_x + c_to_x = amount(x)
   - pool_to_y + c_to_y = amount(y)
   - amount(c) = c_to_x + c_to_y

2. You introduce a new variable, pool_s, that represents the sulfur concentration of the pool.
Using pool_s and the sulfur concentration of C (2%), you obtain two nonlinear inequality
constraints for the sulfur concentrations of X and Y, one nonlinear equality constraint for the
sulfur balance, and lower and upper boundary restrictions for pool_s:

   - pool_s pool_to_x + 2 c_to_x <= 2.5 amount(x)
   - pool_s pool_to_y + 2 c_to_y <= 1.5 amount(y)
   - 3 amount(a) + 1 amount(b) = pool_s (amount(a) + amount(b))
   - 1 <= pool_s <= 3

3. The last group assembles the remaining boundary constraints. First, you do not want to
produce more than you can sell; and finally, all variables must be nonnegative:

   - amount(x) <= 100, amount(y) <= 200
   - amount(a), amount(b), amount(c), amount(x), amount(y) >= 0
   - pool_to_x, pool_to_y, c_to_x, c_to_y >= 0
There exist several local optima to this problem that can be found by specifying different starting
points. Using the starting point with all variables equal to 1 (specified with a PARMS statement),
PROC NLP finds a solution with profit = 400:
proc nlp all;
   parms amountx amounty amounta amountb amountc
         pooltox pooltoy ctox ctoy pools = 1;
   bounds 0 <= amountx amounty amounta amountb amountc,
          amountx <= 100,
          amounty <= 200,
          0 <= pooltox pooltoy ctox ctoy,
          1 <= pools <= 3;
   lincon amounta + amountb = pooltox + pooltoy,
          pooltox + ctox = amountx,
          pooltoy + ctoy = amounty,
          ctox + ctoy = amountc;
   nlincon nlc1-nlc2 >= 0.,
           nlc3 = 0.;
   max f;
   costa = 6; costb = 16; costc = 10;
   costx = 9; costy = 15;
   f = costx * amountx + costy * amounty
       - costa * amounta - costb * amountb - costc * amountc;
   nlc1 = 2.5 * amountx - pools * pooltox - 2. * ctox;
   nlc2 = 1.5 * amounty - pools * pooltoy - 2. * ctoy;
   nlc3 = 3 * amounta + amountb - pools * (amounta + amountb);
run;
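A local solution reported by the procedure is easy to verify by hand. The following Python sketch (not part of the SAS documentation) checks the profit and all constraints at the reported optimum, with the tiny round-off values in Output 6.7.6 replaced by zero:

```python
# Reported local optimum with profit 400: only blend Y is produced,
# from 100 units each of B and C (round-off values set to zero)
amountx, amounty = 0.0, 200.0
amounta, amountb, amountc = 0.0, 100.0, 100.0
pooltox, pooltoy, ctox, ctoy, pools = 0.0, 100.0, 0.0, 100.0, 1.0

profit = 9*amountx + 15*amounty - 6*amounta - 16*amountb - 10*amountc
print(profit)   # 400.0

# Linear mass balances
print(amounta + amountb == pooltox + pooltoy)
print(pooltox + ctox == amountx, pooltoy + ctoy == amounty)
print(ctox + ctoy == amountc)

# Nonlinear sulfur constraints: nlc1, nlc2 >= 0 and nlc3 == 0
nlc1 = 2.5*amountx - pools*pooltox - 2.0*ctox
nlc2 = 1.5*amounty - pools*pooltoy - 2.0*ctoy
nlc3 = 3*amounta + amountb - pools*(amounta + amountb)
print(nlc1, nlc2, nlc3)
```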
The specified starting point was not feasible with respect to the linear equality constraints; therefore,
a starting point is generated that satisfies the linear and boundary constraints. Output 6.7.1 gives the
starting parameter estimates.
Output 6.7.1 Starting Estimates
PROC NLP: Nonlinear Maximization
Optimization Start
Parameter Estimates
Gradient Gradient Lower
Objective Lagrange Bound
N Parameter Estimate Function Function Constraint
1 amountx 1.363636 9.000000 -0.843698 0
2 amounty 1.363636 15.000000 -0.111882 0
3 amounta 0.818182 -6.000000 -0.430733 0
4 amountb 0.818182 -16.000000 -0.542615 0
5 amountc 1.090909 -10.000000 0.017768 0
6 pooltox 0.818182 0 -0.669628 0
7 pooltoy 0.818182 0 -0.303720 0
8 ctox 0.545455 0 -0.174070 0
9 ctoy 0.545455 0 0.191838 0
10 pools 2.000000 0 0.068372 1.000000
Optimization Start
Parameter Estimates
Upper
Bound
N Parameter Constraint
1 amountx 100.000000
2 amounty 200.000000
3 amounta .
4 amountb .
5 amountc .
6 pooltox .
7 pooltoy .
8 ctox .
9 ctoy .
10 pools 3.000000
Value of Objective Function = 3.8181818182
Value of Lagrange Function = -2.866739915
The starting point satisfies the four equality constraints, as shown in Output 6.7.2. The nonlinear
constraints are given in Output 6.7.3.
Output 6.7.2 Linear Constraints
PROC NLP: Nonlinear Maximization
Linear Constraints
1  -3.331E-16 : ACT  0 == + 1.0000 * amounta + 1.0000 * amountb - 1.0000 * pooltox - 1.0000 * pooltoy
2   1.1102E-16 : ACT  0 == - 1.0000 * amountx + 1.0000 * pooltox + 1.0000 * ctox
3   1.1102E-16 : ACT  0 == - 1.0000 * amounty + 1.0000 * pooltoy + 1.0000 * ctoy
4   1.1102E-16 : ACT  0 == - 1.0000 * amountc + 1.0000 * ctox + 1.0000 * ctoy
Output 6.7.3 Nonlinear Constraints
PROC NLP: Nonlinear Maximization
Values of Nonlinear Constraints
Lagrange
Constraint Value Residual Multiplier
[ 5 ] nlc3 0 0 4.9441 Active NLEC
[ 6 ] nlc1_G 0.6818 0.6818 .
[ 7 ] nlc2_G -0.6818 -0.6818 -9.8046 Violat. NLIC
Output 6.7.4 shows the settings of some important PROC NLP options.
Output 6.7.4 Options
PROC NLP: Nonlinear Maximization
Minimum Iterations 0
Maximum Iterations 200
Maximum Function Calls 500
Iterations Reducing Constraint Violation 20
ABSGCONV Gradient Criterion 0.00001
GCONV Gradient Criterion 1E-8
ABSFCONV Function Criterion 0
FCONV Function Criterion 2.220446E-16
FCONV2 Function Criterion 1E-6
FSIZE Parameter 0
ABSXCONV Parameter Change Criterion 0
XCONV Parameter Change Criterion 0
XSIZE Parameter 0
ABSCONV Function Criterion 1.340781E154
Line Search Method 2
Starting Alpha for Line Search 1
Line Search Precision LSPRECISION 0.4
DAMPSTEP Parameter for Line Search .
FD Derivatives: Accurate Digits in Obj.F 15.653559775
FD Derivatives: Accurate Digits in NLCon 15.653559775
Singularity Tolerance (SINGULAR) 1E-8
Constraint Precision (LCEPS) 1E-8
Linearly Dependent Constraints (LCSING) 1E-8
Releasing Active Constraints (LCDEACT) .
The iteration history, given in Output 6.7.5, does not show any problems.
Output 6.7.5 Iteration History
PROC NLP: Nonlinear Maximization
Dual Quasi-Newton Optimization
Modified VMCWD Algorithm of Powell (1978, 1982)
Dual Broyden - Fletcher - Goldfarb - Shanno Update (DBFGS)
Lagrange Multiplier Update of Powell(1982)
Maximum
Gradient
Element
Maximum Predicted of the
Function Objective Constraint Function Step Lagrange
Iter Restarts Calls Function Violation Reduction Size Function
1 0 19 -1.42400 0.00962 6.9131 1.000 0.783
2' 0 20 2.77026 0.0166 5.3770 1.000 2.629
3 0 21 7.08706 0.1409 7.1965 1.000 9.452
4' 0 22 11.41264 0.0583 15.5769 1.000 23.390
5' 0 23 24.84613 8.88E-16 496.1 1.000 147.6
6 0 24 378.22825 147.4 3316.7 1.000 840.4
7' 0 25 307.56810 50.9339 607.9 1.000 27.143
8' 0 26 347.24468 1.8329 21.9883 1.000 28.482
9' 0 27 349.49255 0.00915 7.1833 1.000 28.289
10' 0 28 356.58341 0.1083 50.2566 1.000 27.479
11' 0 29 388.70731 2.4280 24.7996 1.000 21.114
12' 0 30 389.30118 0.0157 10.0475 1.000 18.647
13' 0 31 399.19240 0.7997 11.1862 1.000 0.416
14' 0 32 400.00000 0.0128 0.1533 1.000 0.00087
15' 0 33 400.00000 7.38E-11 2.44E-10 1.000 365E-12
Optimization Results
Iterations 15 Function Calls 34
Gradient Calls 18 Active Constraints 10
Objective Function 400 Maximum Constraint 7.381118E-11
Violation
Maximum Projected Gradient 0 Value Lagrange Function -400
Maximum Gradient of the 1.065814E-14 Slope of Search Direction -2.43574E-10
Lagran Func
FCONV2 convergence criterion satisfied.
The optimal solution in Output 6.7.6 shows that to obtain the maximum profit of $400, you need
only to produce the maximum 200 units of blending Y and no units of blending X.
Output 6.7.6 Optimization Solution
PROC NLP: Nonlinear Maximization
Optimization Results
Parameter Estimates
Gradient Gradient Active
Objective Lagrange Bound
N Parameter Estimate Function Function Constraint
1 amountx -1.40474E-11 9.000000 0 Lower BC
2 amounty 200.000000 15.000000 0 Upper BC
3 amounta 1.027701E-16 -6.000000 0 Lower BC
4 amountb 100.000000 -16.000000 -1.77636E-15
5 amountc 100.000000 -10.000000 1.776357E-15
6 pooltox 7.024003E-12 0 0 Lower BC
7 pooltoy 100.000000 0 -1.06581E-14
8 ctox -2.10714E-11 0 5.329071E-15 Lower BC LinDep
9 ctoy 100.000000 0 1.776357E-15
10 pools 1.000000 0 0 Lower BC LinDep
Value of Objective Function = 400
Value of Lagrange Function = 400
The constraints are satisfied at the solution, as shown in Output 6.7.7.
Output 6.7.7 Linear and Nonlinear Constraints at the Solution
PROC NLP: Nonlinear Maximization
Linear Constraints Evaluated at Solution
1 ACT           0 = 0 + 1.0000 * amounta + 1.0000 * amountb - 1.0000 * pooltox - 1.0000 * pooltoy
2 ACT  -4.481E-17 = 0 - 1.0000 * amountx + 1.0000 * pooltox + 1.0000 * ctox
3 ACT           0 = 0 - 1.0000 * amounty + 1.0000 * pooltoy + 1.0000 * ctoy
4 ACT           0 = 0 - 1.0000 * amountc + 1.0000 * ctox + 1.0000 * ctoy
Values of Nonlinear Constraints
Lagrange
Constraint Value Residual Multiplier
[ 5 ] nlc3 0 0 6.0000 Active NLEC
[ 6 ] nlc1_G 4.04E-16 4.04E-16 . Active NLIC LinDep
[ 7 ] nlc2_G -284E-16 -284E-16 -6.0000 Active NLIC
Output 6.7.7 continued
Linearly Dependent Active Boundary Constraints
Parameter N Kind
ctox 8 Lower BC
pools 10 Lower BC
Linearly Dependent Gradients of Active Nonlinear Constraints
Parameter N
nlc3 6
The same problem can be specified in many different ways. For example, the following specification
uses an INEST= data set containing the values of the starting point and of the constants COSTA,
COSTB, COSTC, COSTX, COSTY, CA, CB, CC, and CD:
data init1(type=est);
input _type_ $ amountx amounty amounta amountb
amountc pooltox pooltoy ctox ctoy pools
_rhs_ costa costb costc costx costy
ca cb cc cd;
datalines;
parms 1 1 1 1 1 1 1 1 1 1
. 6 16 10 9 15 2.5 1.5 2. 3.
;
proc nlp inest=init1 all;
   parms amountx amounty amounta amountb amountc
         pooltox pooltoy ctox ctoy pools;
   bounds 0 <= amountx amounty amounta amountb amountc,
          amountx <= 100,
          amounty <= 200,
          0 <= pooltox pooltoy ctox ctoy,
          1 <= pools <= 3;
   lincon amounta + amountb = pooltox + pooltoy,
          pooltox + ctox = amountx,
          pooltoy + ctoy = amounty,
          ctox + ctoy = amountc;
   nlincon nlc1-nlc2 >= 0.,
           nlc3 = 0.;
   max f;
   f = costx * amountx + costy * amounty
       - costa * amounta - costb * amountb - costc * amountc;
   nlc1 = ca * amountx - pools * pooltox - cc * ctox;
   nlc2 = cb * amounty - pools * pooltoy - cc * ctoy;
   nlc3 = cd * amounta + amountb - pools * (amounta + amountb);
run;
The third specification uses an INEST= data set containing the boundary and linear constraints in
addition to the values of the starting point and of the constants. This specification also writes the
model specification into an OUTMOD= data set:
data init2(type=est);
input _type_ $ amountx amounty amounta amountb amountc
pooltox pooltoy ctox ctoy pools
_rhs_ costa costb costc costx costy;
datalines;
parms 1 1 1 1 1 1 1 1 1 1
. 6 16 10 9 15 2.5 1.5 2 3
lowerbd 0 0 0 0 0 0 0 0 0 1
. . . . . . . . . .
upperbd 100 200 . . . . . . . 3
. . . . . . . . . .
eq . . 1 1 . -1 -1 . . .
0 . . . . . . . . .
eq 1 . . . . -1 . -1 . .
0 . . . . . . . . .
eq . 1 . . . . -1 . -1 .
0 . . . . . . . . .
eq . . . . 1 . . -1 -1 .
0 . . . . . . . . .
;
proc nlp inest=init2 outmod=model all;
   parms amountx amounty amounta amountb amountc
         pooltox pooltoy ctox ctoy pools;
   nlincon nlc1-nlc2 >= 0.,
           nlc3 = 0.;
   max f;
   f = costx * amountx + costy * amounty
       - costa * amounta - costb * amountb - costc * amountc;
   nlc1 = 2.5 * amountx - pools * pooltox - 2. * ctox;
   nlc2 = 1.5 * amounty - pools * pooltoy - 2. * ctoy;
   nlc3 = 3 * amounta + amountb - pools * (amounta + amountb);
run;
The fourth specification not only reads the INEST=INIT2 data set, it also uses the model specification
from the MODEL data set that was generated in the last specification. The PROC NLP call now
contains only the defining variable statements:
proc nlp inest=init2 model=model all;
parms amountx amounty amounta amountb amountc
pooltox pooltoy ctox ctoy pools;
nlincon nlc1-nlc2 >= 0.,
nlc3 = 0.;
max f;
run;
All four specifications start with the same starting point of all variables equal to 1 and generate the
same results. However, there exist several local optima to this problem, as is pointed out in Liebman
et al. (1986, p. 130).
proc nlp inest=init2 model=model all;
parms amountx amounty amounta amountb amountc
pooltox pooltoy ctox ctoy = 0,
pools = 2;
nlincon nlc1-nlc2 >= 0.,
nlc3 = 0.;
max f;
run;
This starting point with all variables equal to 0 is accepted as a local solution with profit = 0, which
minimizes rather than maximizes the profit.
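The global optimum reported in Output 6.7.6 is easy to check outside SAS. The following sketch (an illustrative verification in Python, not part of the documentation; variable names mirror the PROC NLP parameters and constants) recomputes the profit and the three nonlinear constraints at that solution:

```python
# Solution values from Output 6.7.6 (tiny residuals rounded to zero)
amountx, amounty = 0.0, 200.0
amounta, amountb, amountc = 0.0, 100.0, 100.0
pooltox, pooltoy = 0.0, 100.0
ctox, ctoy = 0.0, 100.0
pools = 1.0

# Cost and quality constants from the INEST= data set
costa, costb, costc, costx, costy = 6.0, 16.0, 10.0, 9.0, 15.0
ca, cb, cc, cd = 2.5, 1.5, 2.0, 3.0

# Objective: revenue from blends x and y minus raw-material costs
f = costx*amountx + costy*amounty - costa*amounta - costb*amountb - costc*amountc

# Nonlinear (pooling) constraints, as in the NLINCON statement
nlc1 = ca*amountx - pools*pooltox - cc*ctox
nlc2 = cb*amounty - pools*pooltoy - cc*ctoy
nlc3 = cd*amounta + amountb - pools*(amounta + amountb)

print(f, nlc1, nlc2, nlc3)   # 400.0 0.0 0.0 0.0
```

The profit is $400 and all three pooling constraints hold exactly at this point.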
Example 6.8: Chemical Equilibrium
The following example is used in many test libraries for nonlinear programming and was taken
originally from Bracken and McCormick (1968).
The problem is to determine the composition of a mixture of various chemicals that satisfies its chemical
equilibrium state. The second law of thermodynamics implies that a mixture of chemicals satisfies
its chemical equilibrium state (at a constant temperature and pressure) when the free energy of the
mixture is reduced to a minimum. Therefore, the composition of the chemicals satisfying its chemical
equilibrium state can be found by minimizing the free energy function of the mixture.
Notation:

   m      number of chemical elements in the mixture
   n      number of compounds in the mixture
   x_j    number of moles for compound j, j = 1,...,n
   s      total number of moles in the mixture (s = sum_{j=1}^n x_j)
   a_ij   number of atoms of element i in a molecule of compound j
   b_i    atomic weight of element i in the mixture

Constraints for the Mixture:

- The number of moles must be positive:

     x_j > 0,   j = 1,...,n

- There are m mass balance relationships:

     sum_{j=1}^n a_ij x_j = b_i,   i = 1,...,m
Objective Function: Total Free Energy of Mixture

   f(x) = sum_{j=1}^n x_j [ c_j + ln( x_j / s ) ]

with

   c_j = ( F/RT )_j + ln P

where ( F/RT )_j is the model standard free energy function for the jth compound (found in tables)
and P is the total pressure in atmospheres.

Minimization Problem:

Determine the parameters x_j that minimize the objective function f(x) subject to the nonnegativity
and linear balance constraints.
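As a numerical cross-check on this formulation, the sketch below (illustrative Python, separate from the SAS example; it assumes the c_j values from the compound table and the optimal estimates reported later in Output 6.8.2) evaluates f(x) and the three mass balances at the solution:

```python
from math import log

# c_j values from the compound table; x_j estimates from Output 6.8.2
c = [-6.089, -17.164, -34.054, -5.914, -24.721,
     -14.986, -24.100, -10.708, -26.662, -22.179]
x = [0.040668, 0.147730, 0.783153, 0.001414, 0.485247,
     0.000693, 0.027399, 0.017947, 0.037314, 0.096871]

s = sum(x)                                             # total moles in the mixture
f = sum(xj * (cj + log(xj / s)) for xj, cj in zip(x, c))

# The three mass balances (H, N, O), as in the LINCON statement
h_balance = x[0] + 2*x[1] + 2*x[2] + x[5] + x[9]       # should be 2
n_balance = x[3] + 2*x[4] + x[5] + x[6]                # should be 1
o_balance = x[2] + x[6] + x[7] + 2*x[8] + x[9]         # should be 1

print(round(f, 4))   # close to the reported optimum -47.76109086
```

The small discrepancy (on the order of 1E-4) comes only from the rounding of the printed estimates.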
Numeric Example:

Determine the equilibrium composition of the compound (1/2) N2H4 + (1/2) O2 at temperature
T = 3500 K and pressure P = 750 psi.
                                                a_ij
                                        i = 1   i = 2   i = 3
    j   Compound   (F/RT)_j      c_j      H       N       O
    1   H           -10.021   -6.089      1
    2   H2          -21.096  -17.164      2
    3   H2O         -37.986  -34.054      2               1
    4   N            -9.846   -5.914              1
    5   N2          -28.653  -24.721              2
    6   NH          -18.918  -14.986      1       1
    7   NO          -28.032  -24.100              1       1
    8   O           -14.640  -10.708                      1
    9   O2          -30.594  -26.662                      2
   10   OH          -26.111  -22.179      1               1
Example Specification:
proc nlp tech=tr pall;
   array c[10] -6.089 -17.164 -34.054 -5.914 -24.721
               -14.986 -24.100 -10.708 -26.662 -22.179;
   array x[10] x1-x10;
   min y;
   parms x1-x10 = .1;
   bounds 1.e-6 <= x1-x10;
   lincon 2. = x1 + 2. * x2 + 2. * x3 + x6 + x10,
          1. = x4 + 2. * x5 + x6 + x7,
          1. = x3 + x7 + x8 + 2. * x9 + x10;
   s = x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10;
   y = 0.;
   do j = 1 to 10;
      y = y + x[j] * (c[j] + log(x[j] / s));
   end;
run;
Displayed Output:
The iteration history given in Output 6.8.1 does not show any problems.
Output 6.8.1 Iteration History
PROC NLP: Nonlinear Minimization
Trust Region Optimization
Without Parameter Scaling
Max Abs Trust
Rest Func Act Objective Obj Fun Gradient Region
Iter arts Calls Con Function Change Element Lambda Radius
1 0 2 3' -47.33412 2.2790 6.0765 2.456 1.000
2 0 3 3' -47.70043 0.3663 8.5592 0.908 0.418
3 0 4 3 -47.73074 0.0303 6.4942 0 0.359
4 0 5 3 -47.73275 0.00201 4.7606 0 0.118
5 0 6 3 -47.73554 0.00279 3.2125 0 0.0168
6 0 7 3 -47.74223 0.00669 1.9552 110.6 0.00271
7 0 8 3 -47.75048 0.00825 1.1157 102.9 0.00563
8 0 9 3 -47.75876 0.00828 0.4165 3.787 0.0116
9 0 10 3 -47.76101 0.00224 0.0716 0 0.0121
10 0 11 3 -47.76109 0.000083 0.00238 0 0.0111
11 0 12 3 -47.76109 9.609E-8 2.733E-6 0 0.00248
Optimization Results
Iterations 11 Function Calls 13
Hessian Calls 12 Active Constraints 3
Objective Function -47.76109086 Max Abs Gradient Element 1.8637499E-6
Lambda 0 Actual Over Pred Change 0
Radius 0.0024776027
GCONV convergence criterion satisfied.
Output 6.8.2 lists the optimal parameters with the gradient.
Output 6.8.2 Optimization Results
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Gradient
Objective
N Parameter Estimate Function
1 x1 0.040668 -9.785055
2 x2 0.147730 -19.570110
3 x3 0.783153 -34.792170
4 x4 0.001414 -12.968921
5 x5 0.485247 -25.937841
6 x6 0.000693 -22.753976
7 x7 0.027399 -28.190984
8 x8 0.017947 -15.222060
9 x9 0.037314 -30.444120
10 x10 0.096871 -25.007115
Value of Objective Function = -47.76109086
The three equality constraints are satisfied at the solution, as shown in Output 6.8.3.
Output 6.8.3 Linear Constraints at Solution
PROC NLP: Nonlinear Minimization
Linear Constraints Evaluated at Solution
1 ACT -3.608E-16 = 2.0000 - 1.0000 * x1 - 2.0000 * x2 - 2.0000 * x3
                          - 1.0000 * x6 - 1.0000 * x10
2 ACT 2.2204E-16 = 1.0000 - 1.0000 * x4 - 2.0000 * x5 - 1.0000 * x6 - 1.0000 * x7
3 ACT -1.943E-16 = 1.0000 - 1.0000 * x3 - 1.0000 * x7 - 1.0000 * x8
                          - 2.0000 * x9 - 1.0000 * x10
The Lagrange multipliers are given in Output 6.8.4.
Output 6.8.4 Lagrange Multipliers
PROC NLP: Nonlinear Minimization
First Order Lagrange Multipliers
Lagrange
Active Constraint Multiplier
Linear EC [1] 9.785055
Linear EC [2] 12.968921
Linear EC [3] 15.222060
The elements of the projected gradient must be small to satisfy a necessary first-order optimality
condition. The projected gradient is given in Output 6.8.5.
Output 6.8.5 Projected Gradient
PROC NLP: Nonlinear Minimization
Projected Gradient
Free Projected
Dimension Gradient
1 4.5770108E-9
2 6.868355E-10
3 -7.283013E-9
4 -0.000001864
5 -0.000001434
6 -0.000001361
7 -0.000000294
The projected Hessian matrix shown in Output 6.8.6 is positive definite, satisfying the second-order
optimality condition.
Output 6.8.6 Projected Hessian Matrix
PROC NLP: Nonlinear Minimization
Projected Hessian Matrix
X1 X2 X3 X4
X1 20.903196985 -0.122067474 2.6480263467 3.3439156526
X2 -0.122067474 565.97299938 106.54631863 -83.7084843
X3 2.6480263467 106.54631863 1052.3567179 -115.230587
X4 3.3439156526 -83.7084843 -115.230587 37.529977667
X5 -1.373829641 -37.43971036 182.89278895 -4.621642366
X6 -1.491808185 -36.20703737 175.97949593 -4.574152161
X7 1.1462413516 -16.635529 -57.04158208 10.306551561
Projected Hessian Matrix
X5 X6 X7
X1 -1.373829641 -1.491808185 1.1462413516
X2 -37.43971036 -36.20703737 -16.635529
X3 182.89278895 175.97949593 -57.04158208
X4 -4.621642366 -4.574152161 10.306551561
X5 79.326057844 22.960487404 -12.69831637
X6 22.960487404 66.669897023 -8.121228758
X7 -12.69831637 -8.121228758 14.690478023
The following PROC NLP call uses a specified analytic gradient, and the Hessian matrix is computed
by finite-difference approximations based on the analytic gradient:
proc nlp tech=tr fdhessian all;
   array c[10] -6.089 -17.164 -34.054 -5.914 -24.721
               -14.986 -24.100 -10.708 -26.662 -22.179;
   array x[10] x1-x10;
   array g[10] g1-g10;
   min y;
   parms x1-x10 = .1;
   bounds 1.e-6 <= x1-x10;
   lincon 2. = x1 + 2. * x2 + 2. * x3 + x6 + x10,
          1. = x4 + 2. * x5 + x6 + x7,
          1. = x3 + x7 + x8 + 2. * x9 + x10;
   s = x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10;
   y = 0.;
   do j = 1 to 10;
      y = y + x[j] * (c[j] + log(x[j] / s));
      g[j] = c[j] + log(x[j] / s);
   end;
run;
The results are almost identical to those of the previous run.
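The analytic gradient coded above, g[j] = c[j] + log(x[j]/s), can itself be checked independently against central finite differences at the starting point (an illustrative Python sketch, separate from the SAS run; the derivative of the s-dependent part cancels exactly, which is why the simple formula is correct):

```python
from math import log

c = [-6.089, -17.164, -34.054, -5.914, -24.721,
     -14.986, -24.100, -10.708, -26.662, -22.179]

def objective(x):
    # Total free energy f(x) = sum x_j (c_j + ln(x_j/s))
    s = sum(x)
    return sum(xj * (cj + log(xj / s)) for xj, cj in zip(x, c))

x0 = [0.1] * 10                     # the PARMS starting point
s0 = sum(x0)
analytic = [cj + log(xj / s0) for xj, cj in zip(x0, c)]

# Central finite differences of the objective
h = 1e-6
numeric = []
for j in range(10):
    xp = x0[:]; xp[j] += h
    xm = x0[:]; xm[j] -= h
    numeric.append((objective(xp) - objective(xm)) / (2 * h))

max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_err)   # rounding-level, far below typical gradient-check tolerances
```

A check like this is what the PROC NLP GRADCHECK option automates.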
Example 6.9: Minimize Total Delay in a Network
The following example is taken from the user's guide of GINO (Liebman et al. 1986). A simple
network of five roads (arcs) can be illustrated by the path diagram:
Figure 6.11 Simple Road Network

[Diagram: flow F enters the network at intersection 1 and leaves at intersection 4;
the five arcs are (1,2), (1,3), (3,2), (2,4), and (3,4).]
The five roads connect four intersections illustrated by numbered nodes. Each minute F vehicles
enter and leave the network. Arc (i,j) refers to the road from intersection i to intersection j, and
the parameter x_ij refers to the flow from i to j. The law that traffic flowing into each intersection
must also flow out is described by the linear equality constraints

   sum_i x_ij = sum_i x_ji ,   j = 1,...,n

In general, roads also have an upper capacity, which is the number of vehicles that can be handled
per minute. The upper limits c_ij can be enforced by the boundary constraints

   0 <= x_ij <= c_ij ,   i,j = 1,...,n
Finding the maximum ow through a network is equivalent to solving a simple linear optimization
problem, and for large problems, PROC LP or PROC NETFLOW can be used. The objective function
is
   max f = x_24 + x_34

and the constraints are

   x_13 = x_32 + x_34
   x_12 + x_32 = x_24
   x_12 + x_13 = x_24 + x_34

   0 <= x_12, x_32, x_34 <= 10
   0 <= x_13, x_24 <= 30
The three linear equality constraints are linearly dependent. One of them is deleted automatically by
the PROC NLP subroutines. Even though the default technique is used for this small example, any
optimization subroutine can be used.
proc nlp all initial=.5;
   max y;
   parms x12 x13 x32 x24 x34;
   bounds x12 <= 10,
          x13 <= 30,
          x32 <= 10,
          x24 <= 30,
          x34 <= 10;
   /* what flows into an intersection must flow out */
   lincon x13 = x32 + x34,
          x12 + x32 = x24,
          x24 + x34 = x12 + x13;
   y = x24 + x34 + 0 * x12 + 0 * x13 + 0 * x32;
run;
The iteration history is given in Output 6.9.1, and the optimal solution is given in Output 6.9.2.
Output 6.9.1 Iteration History
PROC NLP: Nonlinear Maximization
Newton-Raphson Ridge Optimization
Without Parameter Scaling
Actual
Max Abs Over
Rest Func Act Objective Obj Fun Gradient Pred
Iter arts Calls Con Function Change Element Ridge Change
1 * 0 2 4 20.25000 19.2500 0.5774 0.0313 0.860
2 * 0 3 5 30.00000 9.7500 0 0.0313 1.683
Optimization Results
Iterations 2 Function Calls 4
Hessian Calls 3 Active Constraints 5
Objective Function 30 Max Abs Gradient Element 0
Ridge 0 Actual Over Pred Change 1.6834532374
All parameters are actively constrained. Optimization cannot proceed.
Output 6.9.2 Optimization Results
PROC NLP: Nonlinear Maximization
Optimization Results
Parameter Estimates
Gradient Active
Objective Bound
N Parameter Estimate Function Constraint
1 x12 10.000000 0 Upper BC
2 x13 20.000000 0
3 x32 10.000000 0 Upper BC
4 x24 20.000000 1.000000
5 x34 10.000000 1.000000 Upper BC
Value of Objective Function = 30
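The estimates in Output 6.9.2 can be confirmed as a feasible maximum flow by checking the conservation equations and capacities directly (an illustrative Python check, not part of the SAS example):

```python
# Arc flows from Output 6.9.2
x12, x13, x32, x24, x34 = 10.0, 20.0, 10.0, 20.0, 10.0

# Flow conservation at the interior nodes, as in the LINCON statement
assert x13 == x32 + x34          # node 3
assert x12 + x32 == x24          # node 2
assert x24 + x34 == x12 + x13    # flow out of node 1 equals flow into node 4

# Capacity (boundary) constraints
assert all(v >= 0 for v in (x12, x13, x32, x24, x34))
assert x12 <= 10 and x32 <= 10 and x34 <= 10
assert x13 <= 30 and x24 <= 30

print(x24 + x34)   # total flow into node 4: 30.0
```

Note that the three roads with unit gradients in the output (x24 and x34) are the ones carrying flow into the sink and hence the binding part of the objective.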
Finding a traffic pattern that minimizes the total delay to move F vehicles per minute from node
1 to node 4 introduces nonlinearities that, in turn, demand nonlinear optimization techniques. As
traffic volume increases, speed decreases. Let t_ij be the travel time on arc (i,j) and assume that the
following formulas describe the travel time as increasing functions of the amount of traffic:

   t_12 = 5 + 0.1 x_12 / (1 - x_12 / 10)
   t_13 =       x_13 / (1 - x_13 / 30)
   t_32 = 1 +   x_32 / (1 - x_32 / 10)
   t_24 =       x_24 / (1 - x_24 / 30)
   t_34 = 5 + 0.1 x_34 / (1 - x_34 / 10)
These formulas use the road capacities (upper bounds), assuming F = 5 vehicles per minute have to
be moved through the network. The objective function is now

   min f = t_12 x_12 + t_13 x_13 + t_32 x_32 + t_24 x_24 + t_34 x_34

and the constraints are

   x_13 = x_32 + x_34
   x_12 + x_32 = x_24
   x_24 + x_34 = F = 5

   0 <= x_12, x_32, x_34 <= 10
   0 <= x_13, x_24 <= 30
Again, the default algorithm is used:
proc nlp all initial=.5;
   min y;
   parms x12 x13 x32 x24 x34;
   bounds x12 x13 x32 x24 x34 >= 0;
   lincon x13 = x32 + x34,    /* flow in = flow out */
          x12 + x32 = x24,
          x24 + x34 = 5;      /* = f = desired flow */
   t12 = 5 + .1 * x12 / (1 - x12 / 10);
   t13 = x13 / (1 - x13 / 30);
   t32 = 1 + x32 / (1 - x32 / 10);
   t24 = x24 / (1 - x24 / 30);
   t34 = 5 + .1 * x34 / (1 - x34 / 10);
   y = t12 * x12 + t13 * x13 + t32 * x32 + t24 * x24 + t34 * x34;
run;
The iteration history is given in Output 6.9.3, and the optimal solution is given in Output 6.9.4.
Output 6.9.3 Iteration History
PROC NLP: Nonlinear Minimization
Newton-Raphson Ridge Optimization
Without Parameter Scaling
Actual
Max Abs Over
Rest Func Act Objective Obj Fun Gradient Pred
Iter arts Calls Con Function Change Element Ridge Change
1 0 2 4 40.30303 0.3433 4.44E-16 0 0.508
Optimization Results
Iterations 1 Function Calls 3
Hessian Calls 2 Active Constraints 4
Objective Function 40.303030303 Max Abs Gradient Element 4.440892E-16
Ridge 0 Actual Over Pred Change 0.5083585587
ABSGCONV convergence criterion satisfied.
Output 6.9.4 Optimization Results
PROC NLP: Nonlinear Minimization
Optimization Results
Parameter Estimates
Gradient Active
Objective Bound
N Parameter Estimate Function Constraint
1 x12 2.500000 5.777778
2 x13 2.500000 5.702479
3 x32 -2.77556E-17 1.000000 Lower BC
4 x24 2.500000 5.702479
5 x34 2.500000 5.777778
Value of Objective Function = 40.303030303
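Substituting the estimates from Output 6.9.4 back into the travel-time formulas reproduces the reported objective value (a verification sketch in Python; x32 is taken as exactly 0 rather than the reported -2.8E-17):

```python
# Optimal flows from Output 6.9.4 (x32 rounded to its lower bound, 0)
x12, x13, x32, x24, x34 = 2.5, 2.5, 0.0, 2.5, 2.5

# Travel-time formulas from the example
t12 = 5 + 0.1 * x12 / (1 - x12 / 10)
t13 = x13 / (1 - x13 / 30)
t32 = 1 + x32 / (1 - x32 / 10)
t24 = x24 / (1 - x24 / 30)
t34 = 5 + 0.1 * x34 / (1 - x34 / 10)

# Total delay, the objective of this PROC NLP run
y = t12*x12 + t13*x13 + t32*x32 + t24*x24 + t34*x34
print(round(y, 6))   # 40.30303
```

At the optimum the five vehicles split evenly over the two routes 1-2-4 and 1-3-4, and the unused arc (3,2) stays at its lower bound.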
The active constraints and corresponding Lagrange multiplier estimates (costs) are given in
Output 6.9.5 and Output 6.9.6, respectively.
Output 6.9.5 Linear Constraints at Solution
PROC NLP: Nonlinear Minimization
Linear Constraints Evaluated at Solution
1 ACT 0          = 0 + 1.0000 * x13 - 1.0000 * x32 - 1.0000 * x34
2 ACT 4.4409E-16 = 0 + 1.0000 * x12 + 1.0000 * x32 - 1.0000 * x24
3 ACT 0          = -5.0000 + 1.0000 * x24 + 1.0000 * x34
Output 6.9.6 Lagrange Multipliers at Solution
PROC NLP: Nonlinear Minimization
First Order Lagrange Multipliers
Lagrange
Active Constraint Multiplier
Lower BC x32 0.924702
Linear EC [1] 5.702479
Linear EC [2] 5.777778
Linear EC [3] 11.480257
Output 6.9.7 shows that the projected gradient is very small, satisfying the first-order optimality
criterion.
Output 6.9.7 Projected Gradient at Solution
PROC NLP: Nonlinear Minimization
Projected Gradient
Free Projected
Dimension Gradient
1 4.440892E-16
The projected Hessian matrix (shown in Output 6.9.8) is positive definite, satisfying the second-order
optimality criterion.
Output 6.9.8 Projected Hessian at Solution
PROC NLP: Nonlinear Minimization
Projected Hessian
Matrix
X1
X1 1.535309013
References
Abramowitz, M. and Stegun, I. A. (1972), Handbook of Mathematical Functions, New York: Dover
Publications.

Al-Baali, M. and Fletcher, R. (1985), "Variational Methods for Nonlinear Least Squares," Journal of
the Operations Research Society, 36, 405-421.

Al-Baali, M. and Fletcher, R. (1986), "An Efficient Line Search for Nonlinear Least Squares," Journal
of Optimization Theory and Applications, 48, 359-377.

Bard, Y. (1974), Nonlinear Parameter Estimation, New York: Academic Press.

Beale, E. M. L. (1972), "A Derivation of Conjugate Gradients," in F. A. Lootsma, ed., Numerical
Methods for Nonlinear Optimization, London: Academic Press.

Betts, J. T. (1977), "An Accelerated Multiplier Method for Nonlinear Programming," Journal of
Optimization Theory and Applications, 21, 137-174.

Bracken, J. and McCormick, G. P. (1968), Selected Applications of Nonlinear Programming, New
York: John Wiley & Sons.

Chamberlain, R. M., Powell, M. J. D., Lemaréchal, C., and Pedersen, H. C. (1982), "The Watchdog
Technique for Forcing Convergence in Algorithms for Constrained Optimization," Mathematical
Programming, 16, 1-17.

Cramer, J. S. (1986), Econometric Applications of Maximum Likelihood Methods, Cambridge,
England: Cambridge University Press.

Dennis, J. E., Gay, D. M., and Welsch, R. E. (1981), "An Adaptive Nonlinear Least-Squares
Algorithm," ACM Transactions on Mathematical Software, 7, 348-368.

Dennis, J. E. and Mei, H. H. W. (1979), "Two New Unconstrained Optimization Algorithms Which
Use Function and Gradient Values," Journal of Optimization Theory and Applications, 28, 453-482.

Dennis, J. E. and Schnabel, R. B. (1983), Numerical Methods for Unconstrained Optimization and
Nonlinear Equations, Englewood Cliffs, NJ: Prentice-Hall.

Eskow, E. and Schnabel, R. B. (1991), "Algorithm 695: Software for a New Modified Cholesky
Factorization," ACM Transactions on Mathematical Software, 17, 306-312.

Fletcher, R. (1987), Practical Methods of Optimization, Second Edition, Chichester, UK: John Wiley
& Sons.

Fletcher, R. and Powell, M. J. D. (1963), "A Rapidly Convergent Descent Method for Minimization,"
Computer Journal, 6, 163-168.

Fletcher, R. and Xu, C. (1987), "Hybrid Methods for Nonlinear Least Squares," Journal of Numerical
Analysis, 7, 371-389.

Gallant, A. R. (1987), Nonlinear Statistical Models, New York: John Wiley & Sons.

Gay, D. M. (1983), "Subroutines for Unconstrained Minimization," ACM Transactions on Mathemat-
ical Software, 9, 503-524.

George, J. A. and Liu, J. W. (1981), Computer Solutions of Large Sparse Positive Definite Systems,
Englewood Cliffs, NJ: Prentice-Hall.

Gill, E. P., Murray, W., Saunders, M. A., and Wright, M. H. (1983), "Computing Forward-Difference
Intervals for Numerical Optimization," SIAM Journal on Scientific and Statistical Computing, 4,
310-321.

Gill, E. P., Murray, W., Saunders, M. A., and Wright, M. H. (1984), "Procedures for Optimization
Problems with a Mixture of Bounds and General Linear Constraints," ACM Transactions on
Mathematical Software, 10, 282-298.

Gill, E. P., Murray, W., and Wright, M. H. (1981), Practical Optimization, New York: Academic
Press.

Goldfeld, S. M., Quandt, R. E., and Trotter, H. F. (1966), "Maximisation by Quadratic Hill-Climbing,"
Econometrica, 34, 541-551.

Hambleton, R. K., Swaminathan, H., and Rogers, H. J. (1991), Fundamentals of Item Response
Theory, Newbury Park, CA: Sage Publications.

Hartmann, W. (1992a), Applications of Nonlinear Optimization with PROC NLP and SAS/IML
Software, Technical report, SAS Institute Inc., Cary, NC.

Hartmann, W. (1992b), Nonlinear Optimization in IML, Releases 6.08, 6.09, 6.10, Technical report,
SAS Institute Inc., Cary, NC.

Haverly, C. A. (1978), "Studies of the Behavior of Recursion for the Pooling Problem," SIGMAP
Bulletin, Association for Computing Machinery.

Hock, W. and Schittkowski, K. (1981), Test Examples for Nonlinear Programming Codes, volume
187 of Lecture Notes in Economics and Mathematical Systems, Berlin-Heidelberg-New York:
Springer-Verlag.

Jennrich, R. I. and Sampson, P. F. (1968), "Application of Stepwise Regression to Nonlinear
Estimation," Technometrics, 10, 63-72.

Lawless, J. F. (1982), Statistical Models and Methods for Lifetime Data, New York: John Wiley &
Sons.

Liebman, J., Lasdon, L., Schrage, L., and Waren, A. (1986), Modeling and Optimization with GINO,
California: The Scientific Press.

Lindström, P. and Wedin, P. A. (1984), "A New Line-Search Algorithm for Nonlinear Least-Squares
Problems," Mathematical Programming, 29, 268-296.

Moré, J. J. (1978), "The Levenberg-Marquardt Algorithm: Implementation and Theory," in G. A.
Watson, ed., Lecture Notes in Mathematics, volume 630, 105-116, Berlin-Heidelberg-New York:
Springer-Verlag.

Moré, J. J., Garbow, B. S., and Hillstrom, K. E. (1981), "Testing Unconstrained Optimization
Software," ACM Transactions on Mathematical Software, 7, 17-41.

Moré, J. J. and Sorensen, D. C. (1983), "Computing a Trust-Region Step," SIAM Journal on Scientific
and Statistical Computing, 4, 553-572.

Moré, J. J. and Wright, S. J. (1993), Optimization Software Guide, Philadelphia: SIAM.

Murtagh, B. A. and Saunders, M. A. (1983), MINOS 5.0 User's Guide, Technical Report SOL 83-20,
Stanford University.

Nelder, J. A. and Mead, R. (1965), "A Simplex Method for Function Minimization," Computer
Journal, 7, 308-313.

Polak, E. (1971), Computational Methods in Optimization, New York - San Francisco - London:
Academic Press.

Powell, M. J. D. (1977), "Restart Procedures for the Conjugate Gradient Method," Mathematical
Programming, 12, 241-254.

Powell, M. J. D. (1978a), "Algorithms for Nonlinear Constraints That Use Lagrangian Functions,"
Mathematical Programming, 14, 224-248.

Powell, M. J. D. (1978b), "A Fast Algorithm for Nonlinearly Constrained Optimization Calculations,"
in G. A. Watson, ed., Lecture Notes in Mathematics, volume 630, 144-175, Berlin-Heidelberg-New
York: Springer-Verlag.

Powell, M. J. D. (1982a), "Extensions to Subroutine VF02AD," in R. F. Drenick and F. Kozin,
eds., Systems Modeling and Optimization, Lecture Notes in Control and Information Sciences,
volume 38, 529-538, Berlin-Heidelberg-New York: Springer-Verlag.

Powell, M. J. D. (1982b), "VMCWD: A Fortran Subroutine for Constrained Optimization," DAMTP
1982/NA4, Cambridge, England.

Powell, M. J. D. (1992), "A Direct Search Optimization Method That Models the Objective and
Constraint Functions by Linear Interpolation," DAMTP/NA5, Cambridge, England.

Rosenbrock, H. H. (1960), "An Automatic Method for Finding the Greatest or Least Value of a
Function," Computer Journal, 3, 175-184.

Schittkowski, K. (1980), Nonlinear Programming Codes - Information, Tests, Performance, volume
183 of Lecture Notes in Economics and Mathematical Systems, Berlin-Heidelberg-New York:
Springer-Verlag.

Schittkowski, K. (1987), More Test Examples for Nonlinear Programming Codes, volume 282 of
Lecture Notes in Economics and Mathematical Systems, Berlin-Heidelberg-New York: Springer-
Verlag.

Schittkowski, K. and Stoer, J. (1979), "A Factorization Method for the Solution of Constrained
Linear Least Squares Problems Allowing Subsequent Data Changes," Numerische Mathematik,
31, 431-463.

Stewart, G. W. (1967), "A Modification of Davidon's Minimization Method to Accept Difference
Approximations of Derivatives," J. Assoc. Comput. Mach., 14, 72-83.

Wedin, P. A. and Lindström, P. (1987), Methods and Software for Nonlinear Least Squares Problems,
University of Umeå, Report No. UMINF 133.87.

Whitaker, D., Triggs, C. M., and John, J. A. (1990), "Construction of Block Designs Using Mathe-
matical Programming," J. R. Statist. Soc. B, 52, 497-503.

Wolfe, P. (1982), "Checking the Calculation of Gradients," ACM Transactions on Mathematical
Software, 8, 337-343.
Chapter 7
The NETFLOW Procedure
Contents
Overview: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Network Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
Side Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Advantages of Network Models over LP Models . . . . . . . . . . . . . . . . 461
Mathematical Description of NPSC . . . . . . . . . . . . . . . . . . . . . . . 461
Flow Conservation Constraints . . . . . . . . . . . . . . . . . . . . . . . . 462
Nonarc Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Warm Starts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Getting Started: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . . . . 464
Introductory Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
Syntax: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Functional Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
Interactivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
PROC NETFLOW Statement . . . . . . . . . . . . . . . . . . . . . . . . . 478
CAPACITY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
COEF Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
COLUMN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
CONOPT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
COST Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
DEMAND Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
HEADNODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
ID Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
LO Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
MULT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
NAME Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
NODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
PIVOT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
PRINT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
QUIT Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
RESET Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
RHS Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
ROW Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
RUN Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
SAVE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
SHOW Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
SUPDEM Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
SUPPLY Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
TAILNODE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
TYPE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
VAR Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
Details: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
Input Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
Output Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
Converting Any PROC NETFLOW Format to an MPS-Format SAS Data Set 546
Case Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
Loop Arcs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
Multiple Arcs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
Pricing Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
Dual Variables, Reduced Costs, and Status . . . . . . . . . . . . . . . . . . 552
The Working Basis Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
Flow and Value Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
Tightening Bounds and Side Constraints . . . . . . . . . . . . . . . . . . . 554
Reasons for Infeasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
Missing S Supply and Missing D Demand Values . . . . . . . . . . . . . . . 556
Balancing Total Supply and Total Demand . . . . . . . . . . . . . . . . . . . 561
Warm Starts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 562
How to Make the Data Read of PROC NETFLOW More Efficient . . . . . . 566
Macro Variable _ORNETFL . . . . . . . . . . . . . . . . . . . . . . . . . . 572
Memory Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
The Interior Point Algorithm: NETFLOW Procedure . . . . . . . . . . . . . . . . 574
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
Network Models: Interior Point Algorithm . . . . . . . . . . . . . . . . . . 575
Linear Programming Models: Interior Point Algorithm . . . . . . . . . . . . 586
Generalized Networks: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . 609
What Is a Generalized Network? . . . . . . . . . . . . . . . . . . . . . . . 609
How to Specify Data for Arc Multipliers . . . . . . . . . . . . . . . . . . . 610
Using the New EXCESS= Option in Pure Networks: NETFLOW Procedure . . . . 614
Handling Excess Supply or Demand . . . . . . . . . . . . . . . . . . . . . . 614
Handling Missing Supply and Demand Simultaneously . . . . . . . . . . . . 616
Maximum Flow Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Handling Supply and Demand Ranges . . . . . . . . . . . . . . . . . . . . . 620
Using the New EXCESS= Option in Generalized Networks: NETFLOW Procedure . 621
How Generalized Networks Differ from Pure Networks . . . . . . . . . . . 622
The EXCESS=SUPPLY Option . . . . . . . . . . . . . . . . . . . . . . . . 622
The EXCESS=DEMAND Option . . . . . . . . . . . . . . . . . . . . . . . 624
Examples: NETFLOW Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 626
Example 7.1: Shortest Path Problem . . . . . . . . . . . . . . . . . . . . . 626
Example 7.2: Minimum Cost Flow Problem . . . . . . . . . . . . . . . . . 629
Example 7.3: Using a Warm Start . . . . . . . . . . . . . . . . . . . . . . . 633
Example 7.4: Production, Inventory, Distribution Problem . . . . . . . . . . 635
Example 7.5: Using an Unconstrained Solution Warm Start . . . . . . . . . 648
Example 7.6: Adding Side Constraints, Using a Warm Start . . . . . . . . . 658
Example 7.7: Using a Constrained Solution Warm Start . . . . . . . . . . . . 671
Example 7.8: Nonarc Variables in the Side Constraints . . . . . . . . . . . . 682
Example 7.9: Pure Networks: Using the EXCESS= Option . . . . . . . . . 694
Example 7.10: Maximum Flow Problem . . . . . . . . . . . . . . . . . . . 699
Example 7.11: Generalized Networks: Using the EXCESS= Option . . . . . 702
Example 7.12: Generalized Networks: Maximum Flow Problem . . . . . . 705
Example 7.13: Machine Loading Problem . . . . . . . . . . . . . . . . . . . 707
Example 7.14: Generalized Networks: Distribution Problem . . . . . . . . . 710
Example 7.15: Converting to an MPS-Format SAS Data Set . . . . . . . . . 713
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
Overview: NETFLOW Procedure
Introduction
Constrained network models can be used to describe a wide variety of real-world applications ranging
from production, inventory, and distribution problems to financial applications. These problems can
be solved with the NETFLOW procedure.
These models are conceptually easy since they are based on network diagrams that represent
the problem pictorially. PROC NETFLOW accepts the network specification in a format that is
particularly suited to networks. This not only simplifies problem description but also aids in the
interpretation of the solution.
Certain algebraic features of networks are exploited by a specialized version of the simplex method
so that solution times are reduced. Another optimization algorithm, the interior point algorithm, has
been implemented in PROC NETFLOW and can be used as an alternative to the simplex algorithm
to solve network problems.
Should PROC NETFLOW detect that there are no arcs and nodes in the model's data (that is, there is no
network component), it assumes it is dealing with a linear programming (LP) problem. The interior
point algorithm is automatically selected to perform the optimization.
You can also solve LP problems by using the OPTLP procedure. The OPTLP procedure requires
a linear program to be specified by using a SAS data set that adheres to the MPS format, a widely
accepted format in the optimization community. You can use the MPSOUT= option in the NETFLOW
procedure to convert typical PROC NETFLOW format data sets into MPS-format SAS data sets.
Network Models
A network consists of a collection of nodes joined by a collection of arcs. The arcs connect nodes
and convey flow of one or more commodities that are supplied at supply nodes and demanded at
demand nodes in the network. Each arc has a cost per unit of flow, a flow capacity, and a lower flow
bound associated with it. An important concept in network modeling is conservation of flow.
Conservation of flow means that the total flow in arcs directed toward a node, plus the supply at the
node, minus the demand at the node, equals the total flow in arcs directed away from the node.

A network and its associated data can be described in SAS data sets. PROC NETFLOW uses this
description and finds the flow through each arc in the network that minimizes the total cost of flow,
meets the demand at demand nodes using the supply at supply nodes so that the flow through each
arc is on or between the arc's lower flow bound and its capacity, and satisfies the conservation of
flow.
One class of network models is the production-inventory-distribution problem. The diagram in
Figure 7.1 illustrates this problem. The subscripts on the Production, Inventory, and Sales nodes
indicate the time period. Notice that if you replicate sections of the model, the notion of time can be
included.
Figure 7.1 Production-Inventory-Distribution Problem

[Figure: Production, Inventory, and Sales nodes replicated for time periods i-1, i, and i+1; each
Production node feeds the Inventory node of its period, which feeds the Sales node. Inventory arcs
labeled "Stock on hand" and "Stock at end" link the inventory chain across periods.]
In this type of model, the nodes can represent a wide variety of facilities. Several examples are
suppliers, spot markets, importers, farmers, manufacturers, factories, parts of a plant, production
lines, waste disposal facilities, workstations, warehouses, coolstores, depots, wholesalers, export
markets, ports, rail junctions, airports, road intersections, cities, regions, shops, customers, and
consumers. The diversity of this selection demonstrates the richness of potential applications of this
model.
Depending upon the interpretation of the nodes, the objectives of the modeling exercise can vary
widely. Some common types of objectives are
• to reduce collection or purchase costs of raw materials

• to reduce inventory holding or backorder costs. Warehouses and other storage facilities
sometimes have capacities, and there can be limits on the amount of goods that can be placed
on backorder.

• to decide where facilities should be located and what the capacity of these should be. Network
models have been used to help decide where factories, hospitals, ambulance and fire stations,
oil and water wells, and schools should be sited.

• to determine the assignment of resources (machines, production capability, workforce) to tasks,
schedules, classes, or files

• to determine the optimal distribution of goods or services. This usually means minimizing
transportation costs, and reducing time in transit or distances covered.

• to find the shortest path from one location to another

• to ensure that demands (for example, production requirements, market demands, contractual
obligations) are met

• to maximize profits from the sale of products or the charge for services

• to maximize production by identifying bottlenecks

Some specific applications are

• car distribution models. These help determine which models and numbers of cars should be
manufactured in which factories and where to distribute cars from these factories to zones in
the United States in order to meet customer demand at least cost.

• models in the timber industry. These help determine when to plant and mill forests, schedule
production of pulp, paper, and wood products, and distribute products for sale or export.

• military applications. The nodes can be theatres, bases, ammunition dumps, logistical suppliers,
or radar installations. Some models are used to find the best ways to mobilize personnel and
supplies and to evacuate the wounded in the least amount of time.

• communications applications. The nodes can be telephone exchanges, transmission lines,
satellite links, and consumers. In a model of an electrical grid, the nodes can be transformers,
power stations, watersheds, reservoirs, dams, and consumers. Of concern might be the effect of
high loads or outages.
Side Constraints
Often all the details of a problem cannot be specified in a network model alone. In many of
these cases, these details can be represented by the addition of side constraints to the model. Side
constraints are a linear function of arc variables (variables containing flow through an arc) and
nonarc variables (variables that are not part of the network). This enhancement to the basic network
model allows for very general problems. In fact, any linear program can be represented with network
models having these types of side constraints. The examples that follow help to clarify the notion of
side constraints.
PROC NETFLOW enables you to specify side constraints. The data for a side constraint consist
of coefficients of arcs and coefficients of nonarc variables, a constraint type (that is, ≤, =, or ≥),
and a right-hand-side value (rhs). A nonarc variable has a name, an objective function coefficient
analogous to an arc cost, an upper bound analogous to an arc capacity, and a lower bound analogous
to an arc lower flow bound. PROC NETFLOW finds the flow through the network and the values
of any nonarc variables that minimize the total cost of the solution. Flow conservation is met, flow
through each arc is on or between the arc's lower flow bound and capacity, the value of each nonarc
variable is on or between the nonarc's lower and upper bounds, and the side constraints are satisfied.
Note that, since many linear programs have large embedded networks, PROC NETFLOW is an
attractive alternative to the LP procedure in many cases.
In order for arcs to be specied in side constraints, they must be named. By default, PROC
NETFLOW names arcs using the names of the nodes at the head and tail of the arc. An arc is named
with its tail node name followed by an underscore and its head node name. For example, an arc from
node from to node to is called from_to.
Proportionality Constraints
Side constraints in network models fall into several categories that have special structure. They are
frequently used when the flow through an arc must be proportional to the flow through another arc.
Such constraints are called proportionality constraints and are useful in models where production is
subject to refining or modification into different materials. The amount of each output, or any waste,
evaporation, or reduction can be specified as a proportion of input.

Typically the arcs near the supply nodes carry raw materials and the arcs near the demand nodes
carry refined products. For example, in a model of the milling industry, the flow through some arcs
may represent quantities of wheat. After the wheat is processed, the flow through other arcs might be
flour. For others it might be bran. The side constraints model the relationship between the amount of
flour or bran produced as a proportion of the amount of wheat milled. Some of the wheat can end up
as neither flour, bran, nor any useful product, so this waste is drained away via arcs to a waste node.
Figure 7.2 Proportionality Constraints

[Figure: arc Wheat_Mill (proportion 1.0) enters node Mill; arcs Mill_Flour (0.3), Mill_Bran (0.2),
and Mill_Other (0.5) leave it toward nodes Flour, Bran, and Other.]
Consider the network fragment in Figure 7.2. The arc Wheat_Mill conveys the wheat milled. The
cost of flow on this arc is the milling cost. The capacity of this arc is the capacity of the mill. The
lower flow bound on this arc is the minimum quantity that must be milled for the mill to operate
economically. The constraints
0.3 Wheat_Mill - Mill_Flour = 0.0
0.2 Wheat_Mill - Mill_Bran = 0.0
force every unit of wheat that is milled to produce 0.3 units of flour and 0.2 units of bran. Note that
it is not necessary to specify the constraint
0.5 Wheat_Mill - Mill_Other = 0.0
since flow conservation implies that any flow that does not traverse through Mill_Flour or Mill_Bran
must be conveyed through Mill_Other. And, computationally, it is better if this constraint is not
specified, since there is one less side constraint and fewer problems with numerical precision. Notice
that the sum of the proportions must equal 1.0 exactly; otherwise, flow conservation is violated.
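In the dense side constraint input format described later in this chapter, these two proportionality constraints could be supplied roughly as follows. This is only a sketch: the data set name wheatcon is hypothetical, and the arcs are assumed to keep the default tail_head names shown in Figure 7.2.

```sas
/* Hypothetical dense-format CONDATA= data set for
   0.3 Wheat_Mill - Mill_Flour = 0.0
   0.2 Wheat_Mill - Mill_Bran  = 0.0 */
data wheatcon;
   input Wheat_Mill Mill_Flour Mill_Bran _type_ $ _rhs_;
   datalines;
0.3 -1  . EQ 0
0.2  . -1 EQ 0
;
```

Each SAS variable is an arc name, each observation is one side constraint, and missing values stand for zero coefficients.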
Blending Constraints
Blending or quality constraints can also influence the recipes or proportions of ingredients that are
mixed. For example, different raw materials can have different properties. In an application of the
oil industry, the amount of products that are obtained could be different for each type of crude oil.
Furthermore, fuel might have a minimum octane requirement or limited sulphur or lead content, so
that a blending of crudes is needed to produce the product.
The network fragment in Figure 7.3 shows an example of this.
Figure 7.3 Blending Constraints

[Figure: arcs MidEast_Port and USA_Port enter node Port; arc Port_Refinery leads to node Refinery,
which feeds nodes Gasoline, Diesel, and Other. The crude streams carry 5 units/liter (MidEast) and
4 units/liter (USA) of sulphur; the diesel stream is labeled 4.75 units/liter.]
The arcs MidEast_Port and USA_Port convey crude oil from the two sources. The arc Port_Refinery
represents refining while the arcs Refinery_Gasoline and Refinery_Diesel carry the gas and diesel
produced. The proportionality constraints
0.4 Port_Refinery - Refinery_Gasoline = 0.0
0.2 Port_Refinery - Refinery_Diesel = 0.0
capture the restrictions for producing gasoline and diesel from crude. Suppose that, if only crude
from the Middle East is used, the resulting diesel would contain 5 units of sulphur per liter. If only
crude from the USA is used, the resulting diesel would contain 4 units of sulphur per liter. Diesel
can have at most 4.75 units of sulphur per liter. Some crude from the USA must be used if Middle
East crude is used in order to meet the 4.75 sulphur per liter limit. The side constraint to model this
requirement is
5 MidEast_Port + 4 USA_Port - 4.75 Port_Refinery ≤ 0.0
Since Port_Refinery = MidEast_Port + USA_Port, flow conservation allows this constraint to be
simplified to
1 MidEast_Port - 3 USA_Port ≤ 0.0
If, for example, 120 units of crude from the Middle East is used, then at least 40 units of crude from
the USA must be used. The preceding constraint is simplied because you assume that the sulphur
concentration of diesel is proportional to the sulphur concentration of the crude mix. If this is not the
case, the relation
0.2 Port_Refinery = Refinery_Diesel
is used to obtain
5 MidEast_Port + 4 USA_Port - 4.75 (1.0/0.2) Refinery_Diesel ≤ 0.0
which equals
5 MidEast_Port + 4 USA_Port - 23.75 Refinery_Diesel ≤ 0.0
An example similar to this Oil Industry problem is solved in the section "Introductory Example" on
page 465.
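As a sketch of how the simplified blending restriction might be entered in the dense side constraint format: the data set name blendcon is hypothetical, and the LE type keyword is assumed by analogy with the GE keyword used in the introductory example.

```sas
/* Hypothetical dense-format CONDATA= data set for
   1 MidEast_Port - 3 USA_Port <= 0.0 */
data blendcon;
   input MidEast_Port USA_Port _type_ $ _rhs_;
   datalines;
1 -3 LE 0
;
```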
Multicommodity Problems
Side constraints are also used in models in which there are capacities on transportation or some
other shared resource, or there are limits on overall production or demand in multicommodity,
multidivisional or multiperiod problems. Each commodity, division or period can have a separate
network coupled to one main system by the side constraints. Side constraints are used to combine
the outputs of subdivisions of a problem (either commodities, outputs in distinct time periods, or
different process streams) to meet overall demands or to limit overall production or expenditures.
This method is more desirable than doing separate local optimizations for individual commodity,
process, or time networks and then trying to establish relationships between each when determining
an overall policy if the global constraint is not satisfied. Of course, to make models more realistic,
side constraints may be necessary in the local problems.
Figure 7.4 Multicommodity Problem

[Figure: two parallel network fragments; nodes Factorycom1, City1com1, and City2com1 form the
commodity 1 network, and Factorycom2, City1com2, and City2com2 form the commodity 2 network.]
Figure 7.4 shows two network fragments. They represent identical production and distribution sites
of two different commodities. Suffix com1 represents commodity 1 and suffix com2 represents
commodity 2. The nodes Factorycom1 and Factorycom2 model the same factory, and nodes City1com1
and City1com2 model the same location, city 1. Similarly, City2com1 and City2com2 are the same
location, city 2. Suppose that commodity 1 occupies 2 cubic meters, commodity 2 occupies 3 cubic
meters, the truck dispatched to city 1 has a capacity of 200 cubic meters, and the truck dispatched to
city 2 has a capacity of 250 cubic meters. How much of each commodity can be loaded onto each
truck? The side constraints for this case are

2 Factorycom1_City1com1 + 3 Factorycom2_City1com2 ≤ 200
2 Factorycom1_City2com1 + 3 Factorycom2_City2com2 ≤ 250
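Sketched in the dense side constraint format, these two truck-capacity constraints might look as follows. The data set name truckcon is hypothetical, the arc names follow the default tail_head convention, and the LE type keyword is assumed by analogy with the GE keyword used in the introductory example.

```sas
/* Hypothetical dense-format CONDATA= data set for the two
   truck-capacity side constraints */
data truckcon;
   input Factorycom1_City1com1 Factorycom2_City1com2
         Factorycom1_City2com1 Factorycom2_City2com2
         _type_ $ _rhs_;
   datalines;
2 3 . . LE 200
. . 2 3 LE 250
;
```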
Large Modeling Strategy
In many cases, the flow through an arc might actually represent the flow or movement of a commodity
from place to place or from time period to time period. However, sometimes an arc is included in the
network as a method of capturing some aspect of the problem that you would not normally think of
as part of a network model. For example, in a multiprocess, multiproduct model (Figure 7.5), there
might be subnetworks for each process and each product. The subnetworks can be joined together
by a set of arcs that have flows that represent the amount of product j produced by process i. To
model an upper limit constraint on the total amount of product that can be produced, direct all arcs
carrying product to a single node and from there through a single arc. The capacity of this arc is
the upper limit of product production. It is preferable to model this structure in the network rather
than to include it in the side constraints because the efficiency of the optimizer is affected less by a
reasonable increase in the size of the network.
Figure 7.5 Multiprocess, Multiproduct Example

[Figure: Process 1 and Process 2 subnetworks feed Product 1 and Product 2 subnetworks through
single arcs; the capacities of these arcs are, respectively, the capacity of each process and the upper
limit of each product's production.]
It is often a good strategy when starting a project to use a small network formulation and then use
that model as a framework upon which to add detail. For example, in the multiprocess, multiproduct
model, you might start with the network depicted in Figure 7.5. Then, for example, the process
subnetwork can be enhanced to include the distribution of products. Other phases of the operation
could be included by adding more subnetworks. Initially, these subnetworks can be single nodes,
but in subsequent studies they can be expanded to include greater detail. The NETFLOW procedure
accepts the side constraints in the same dense and sparse formats that the LP procedure provides.
Although PROC LP can solve network problems, the NETFLOW procedure generally solves network
flow problems more efficiently than PROC LP.
Advantages of Network Models over LP Models
Many linear programming problems have large embedded network structures. Such problems often
result when modeling manufacturing processes, transportation or distribution networks, or resource
allocation, or when deciding where to locate facilities. Often, some commodity is to be moved from
place to place, so the more natural formulation in many applications is that of a constrained network
rather than a linear program.
Using a network diagram to visualize a problem makes it possible to capture the important relation-
ships in an easily understood picture form. The network diagram aids the communication between
model builder and model user, making it easier to comprehend how the model is structured, how it
can be changed, and how results can be interpreted.
If a network structure is embedded in a linear program, the problem is a network programming
problem with side constraints (NPSC). When the network part of the problem is large compared to
the nonnetwork part, especially if the number of side constraints is small, it is worthwhile to exploit
this structure in the solution process. This is what PROC NETFLOW does. It uses a variant of the
revised primal simplex algorithm that exploits the network structure to reduce solution time.
Mathematical Description of NPSC
If a network programming problem with side constraints has n nodes, a arcs, g nonarc variables, and
k side constraints, then the formal statement of the problem solved by PROC NETFLOW is
minimize    c^T x + d^T z
subject to  F x = b
            H x + Q z {≥, =, ≤} r
            l ≤ x ≤ u
            m ≤ z ≤ v
where
• c is the a × 1 arc variable objective function coefficient vector (the cost vector)

• x is the a × 1 arc variable value vector (the flow vector)

• d is the g × 1 nonarc variable objective function coefficient vector

• z is the g × 1 nonarc variable value vector

• F is the n × a node-arc incidence matrix of the network, where

  F(i,j) =  1, if arc j is directed from node i
           -1, if arc j is directed toward node i
            0, otherwise
• b is the n × 1 node supply/demand vector, where

  b(i) =  s, if node i has supply capability of s units of flow
         -d, if node i has demand of d units of flow
          0, if node i is a trans-shipment node
• H is the k × a side constraint coefficient matrix for arc variables, where H(i,j) is the coefficient
of arc j in the ith side constraint

• Q is the k × g side constraint coefficient matrix for nonarc variables, where Q(i,j) is the
coefficient of nonarc variable j in the ith side constraint

• r is the k × 1 side constraint right-hand-side vector

• l is the a × 1 arc lower flow bound vector

• u is the a × 1 arc capacity vector

• m is the g × 1 nonarc variable lower bound vector

• v is the g × 1 nonarc variable upper bound vector
Flow Conservation Constraints
The constraints Fx = b are referred to as the nodal flow conservation constraints. These constraints
algebraically state that the sum of the flow through arcs directed toward a node plus that node's
supply, if any, equals the sum of the flow through arcs directed away from that node plus that node's
demand, if any. The flow conservation constraints are implicit in the network model and should
not be specified explicitly in side constraint data when using PROC NETFLOW. The constrained
problems most amenable to being solved by the NETFLOW procedure are those that, after the
removal of the flow conservation constraints, have very few constraints. PROC NETFLOW is
superior to linear programming optimizers when the network part of the problem is significantly
larger than the nonnetwork part.
The NETFLOW procedure can also be used to solve an unconstrained network problem, that is, one
in which H, Q, d, r, and z do not exist.
Nonarc Variables
If the constrained problem to be solved has no nonarc variables, then Q, d, and z do not exist.
However, nonarc variables can be used to simplify side constraints. For example, if a sum of flows
appears in many constraints, it may be worthwhile to equate this expression with a nonarc variable
and use this in the other constraints. By assigning a nonarc variable a nonzero objective function, it is
then possible to incur a cost for using resources above some lowest feasible limit. Similarly, a profit
(a negative objective function coefficient value) can be made if all available resources are not used.
In some models, nonarc variables are used in constraints to absorb excess resources or supply needed
resources. Then, either the excess resource can be used or the needed resource can be supplied to
another component of the model.
For example, consider a multicommodity problem of making television sets that have either 19- or
25-inch screens. In their manufacture, 3 and 4 chips, respectively, are used. Production occurs at
2 factories during March and April. The supplier of chips can supply only 2600 chips to factory 1
and 3750 chips to factory 2 each month. The names of arcs are in the form Prodn_s_m, where n
is the factory number, s is the screen size, and m is the month. For example, Prod1_25_Apr is the
arc that conveys the number of 25-inch TVs produced in factory 1 during April. You might have to
determine similar systematic naming schemes for your application.
As described, the constraints are
3 Prod1_19_Mar + 4 Prod1_25_Mar ≤ 2600
3 Prod2_19_Mar + 4 Prod2_25_Mar ≤ 3750
3 Prod1_19_Apr + 4 Prod1_25_Apr ≤ 2600
3 Prod2_19_Apr + 4 Prod2_25_Apr ≤ 3750
If there are chips that could be obtained for use in March but not used for production in March, why
not keep these unused chips until April? Furthermore, if the March excess chips at factory 1 could
be used either at factory 1 or factory 2 in April, the model becomes
3 Prod1_19_Mar + 4 Prod1_25_Mar + F1_Unused_Mar = 2600
3 Prod2_19_Mar + 4 Prod2_25_Mar + F2_Unused_Mar = 3750
3 Prod1_19_Apr + 4 Prod1_25_Apr - F1_Kept_Since_Mar = 2600
3 Prod2_19_Apr + 4 Prod2_25_Apr - F2_Kept_Since_Mar = 3750
F1_Unused_Mar + F2_Unused_Mar - F1_Kept_Since_Mar - F2_Kept_Since_Mar ≥ 0.0
where F1_Kept_Since_Mar is the number of chips used during April at factory 1 that were obtained in
March at either factory 1 or factory 2, and F2_Kept_Since_Mar is the number of chips used during
April at factory 2 that were obtained in March. The last constraint ensures that the number of chips
used during April that were obtained in March does not exceed the number of chips not used in
March. There may be a cost to hold chips in inventory. This can be modeled by having a positive
objective function coefficient for the nonarc variables F1_Kept_Since_Mar and F2_Kept_Since_Mar.
Moreover, nonarc variable upper bounds represent an upper limit on the number of chips that can be
held in inventory between March and April.
See Example 7.4 through Example 7.8 for a series of examples that use this TV problem. The use of
nonarc variables as described previously is illustrated.
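The five side constraints above might be sketched in the dense CONDATA= format as follows. The data set name chipcon is hypothetical; the F1_Unused_Mar, F2_Unused_Mar, F1_Kept_Since_Mar, and F2_Kept_Since_Mar columns are nonarc variables, supplied alongside the arc columns just as described.

```sas
/* Hypothetical dense-format CONDATA= data set; the last four
   input variables are nonarc variables */
data chipcon;
   input Prod1_19_Mar Prod1_25_Mar Prod2_19_Mar Prod2_25_Mar
         Prod1_19_Apr Prod1_25_Apr Prod2_19_Apr Prod2_25_Apr
         F1_Unused_Mar F2_Unused_Mar
         F1_Kept_Since_Mar F2_Kept_Since_Mar
         _type_ $ _rhs_;
   datalines;
3 4 . . . . . .  1  .  .  . EQ 2600
. . 3 4 . . . .  .  1  .  . EQ 3750
. . . . 3 4 . .  .  . -1  . EQ 2600
. . . . . . 3 4  .  .  . -1 EQ 3750
. . . . . . . .  1  1 -1 -1 GE    0
;
```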
Warm Starts
If you have a problem that has already been partially solved and is to be solved further to obtain a
better, optimal solution, information describing the solution now available may be used as an initial
solution. This is called warm starting the optimization, and the supplied solution data are called the
warm start.
Some data can be changed between the time when a warm start is created and when it is used as a
warm start for a subsequent PROC NETFLOW run. Elements in the arc variable cost vector, the
nonarc variable objective function coefficient vector, and sometimes capacities, upper value bounds,
and side constraint data can be changed between PROC NETFLOW calls. See the section "Warm
Starts" on page 562. Also, see Example 7.4 through Example 7.8 (the TV problem) for a series of
examples that show the use of warm starts.
Getting Started: NETFLOW Procedure
To solve network programming problems with side constraints using PROC NETFLOW, you save a
representation of the network and the side constraints in three SAS data sets. These data sets are then
passed to PROC NETFLOW for solution. There are various forms that a problem's data can take.
You can use any one or a combination of several of these forms.
The NODEDATA= data set contains the names of the supply and demand nodes and the supply or
demand associated with each. These are the elements in the column vector b in problem (NPSC).
The ARCDATA= data set contains information about the variables of the problem. Usually these are
arcs, but there can also be data related to nonarc variables in the ARCDATA= data set.
An arc is identified by the names of its tail node (where it originates) and head node (where it is
directed). Each observation can be used to identify an arc in the network and, optionally, the cost per
flow unit across the arc, the arc's capacity, lower flow bound, and name. These data are associated
with the matrix F and the vectors c, l, and u in problem (NPSC).

NOTE: Although F is a node-arc incidence matrix, it is specified in the ARCDATA= data set by arc
definitions.
In addition, the ARCDATA= data set can be used to specify information about nonarc variables,
including objective function coefficients, lower and upper value bounds, and names. These data are
the elements of the vectors d, m, and v in problem (NPSC). Data for an arc or nonarc variable can
be given in more than one observation.
Supply and demand data also can be specied in the ARCDATA= data set. In such a case, the
NODEDATA= data set may not be needed.
The CONDATA= data set describes the side constraints and their right-hand sides. These data are
elements of the matrices H and Q and the vector r. Constraint types are also specified in the
CONDATA= data set. You can include in this data set upper bound values or capacities, lower flow
or value bounds, and costs or objective function coefficients. It is possible to give all information
about some or all nonarc variables in the CONDATA= data set.
An arc is identified in this data set by its name. If you specify an arc's name in the ARCDATA= data
set, then this name is used to associate data in the CONDATA= data set with that arc. Each arc also
has a default name that is the name of the tail and head node of the arc concatenated together and
separated by an underscore character; tail_head, for example.
If you use the dense side constraint input format (described in the section "CONDATA= Data Set"
on page 535) and want to use the default arc names, these arc names are names of SAS variables in
the VAR list of the CONDATA= data set.

If you use the sparse side constraint input format (see the section "CONDATA= Data Set" on
page 535) and want to use the default arc names, these arc names are values of the COLUMN list
SAS variable of the CONDATA= data set.
The execution of PROC NETFLOW has three stages. In the preliminary (zeroth) stage, the data are
read from the NODEDATA= data set, the ARCDATA= data set, and the CONDATA= data set. Error
checking is performed, and an initial basic feasible solution is found. If an unconstrained solution
warm start is being used, then an initial basic feasible solution is obtained by reading additional data
containing that information in the NODEDATA= data set and the ARCDATA= data set. In this case,
only constraint data and nonarc variable data are read from the CONDATA= data set.
In the first stage, an optimal solution to the network flow problem neglecting any side constraints is
found. The primal and dual solutions for this relaxed problem can be saved in the ARCOUT= data
set and the NODEOUT= data set, respectively. These data sets are named in the PROC NETFLOW,
RESET, and SAVE statements.
In the second stage, an optimal solution to the network flow problem with side constraints is found.
The primal and dual solutions for this side constrained problem are saved in the CONOUT= data set
and the DUALOUT= data set, respectively. These data sets are also named in the PROC NETFLOW,
RESET, and SAVE statements.
If a constrained solution warm start is being used, PROC NETFLOW does not perform the zeroth and
first stages. This warm start can be obtained by reading basis data containing additional information
in the NODEDATA= data set (also called the DUALIN= data set) and the ARCDATA= data set.
If warm starts are to be used in future optimizations, the FUTURE1 and FUTURE2 options must be
used in addition to specifying names for the data sets that contain the primal and dual solutions in
stages one and two. Then, most of the information necessary for restarting problems is available in
the output data sets containing the primal and dual solutions of both the relaxed and side constrained
network programs.
Introductory Example
Consider the following trans-shipment problem for an oil company. Crude oil is shipped to refineries
where it is processed into gasoline and diesel fuel. The gasoline and diesel fuel are then distributed
to service stations. At each stage, there are shipping, processing, and distribution costs. Also, there
are lower flow bounds and capacities.
In addition, there are two sets of side constraints. The first set is that two times the crude from the
Middle East cannot exceed the throughput of a refinery plus 15 units. (The phrase "plus 15 units" that
finishes the last sentence is used to enable some side constraints in this example to have a nonzero
rhs.) The second set of constraints are necessary to model the situation that one unit of crude mix
processed at a refinery yields three-fourths of a unit of gasoline and one-fourth of a unit of diesel
fuel.
Because there are two products that are not independent in the way in which they flow through the
network, a network programming problem with side constraints is an appropriate model for this
example (see Figure 7.6). The side constraints are used to model the limitations on the amount of
Middle Eastern crude that can be processed by each refinery and the conversion proportions of crude
to gasoline and diesel fuel.
Figure 7.6 Oil Industry Example

[Figure: supply nodes middle east and u.s.a. ship crude to nodes refinery 1 and refinery 2; the
refineries feed nodes r1 and r2, which split into ref1 gas, ref1 diesel, ref2 gas, and ref2 diesel;
these in turn serve demand nodes servstn1 gas, servstn1 diesel, servstn2 gas, and servstn2 diesel.]
To solve this problem with PROC NETFLOW, save a representation of the model in three SAS data
sets. In the NODEDATA= data set, you name the supply and demand nodes and give the associated
supplies and demands. To distinguish demand nodes from supply nodes, specify demands as negative
quantities. For the oil example, the NODEDATA= data set can be saved as follows:
title 'Oil Industry Example';
title3 'Setting Up Nodedata = Noded For Proc Netflow';
data noded;
input _node_&$15. _sd_;
datalines;
middle east 100
u.s.a. 80
servstn1 gas -95
servstn1 diesel -30
servstn2 gas -40
servstn2 diesel -15
;
The ARCDATA= data set contains the rest of the information about the network. Each observation
in the data set identifies an arc in the network and gives the cost per flow unit across the arc, the
capacity of the arc, the lower bound on flow across the arc, and the name of the arc.
title3 'Setting Up Arcdata = Arcd1 For Proc Netflow';
data arcd1;
input _from_&$11. _to_&$15. _cost_ _capac_ _lo_ _name_ $;
datalines;
middle east refinery 1 63 95 20 m_e_ref1
middle east refinery 2 81 80 10 m_e_ref2
u.s.a. refinery 1 55 . . .
u.s.a. refinery 2 49 . . .
refinery 1 r1 200 175 50 thruput1
refinery 2 r2 220 100 35 thruput2
r1 ref1 gas . 140 . r1_gas
r1 ref1 diesel . 75 . .
r2 ref2 gas . 100 . r2_gas
r2 ref2 diesel . 75 . .
ref1 gas servstn1 gas 15 70 . .
ref1 gas servstn2 gas 22 60 . .
ref1 diesel servstn1 diesel 18 . . .
ref1 diesel servstn2 diesel 17 . . .
ref2 gas servstn1 gas 17 35 5 .
ref2 gas servstn2 gas 31 . . .
ref2 diesel servstn1 diesel 36 . . .
ref2 diesel servstn2 diesel 23 . . .
;
Finally, the CONDATA= data set contains the side constraints for the model.
title3 'Setting Up Condata = Cond1 For Proc Netflow';
data cond1;
input m_e_ref1 m_e_ref2 thruput1 r1_gas thruput2 r2_gas
_type_ $ _rhs_;
datalines;
-2 . 1 . . . >= -15
. -2 . . 1 . GE -15
. . -3 4 . . EQ 0
. . . . -3 4 = 0
;
Note that the SAS variable names in the CONDATA= data set are the names of arcs given in the
ARCDATA= data set. These are the arcs that have nonzero constraint coefficients in side constraints.
For example, the proportionality constraint that specifies that one unit of crude at each refinery yields
three-fourths of a unit of gasoline and one-fourth of a unit of diesel fuel is given for REFINERY 1 in
the third observation and for REFINERY 2 in the last observation. The third observation requires that
each unit of flow on arc THRUPUT1 yields three-fourths of a unit of flow on arc R1_GAS. Because all
crude processed at REFINERY 1 flows through THRUPUT1 and all gasoline produced at REFINERY 1
flows through R1_GAS, the constraint models the situation. The last observation does the same for
REFINERY 2.
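In algebraic form, the third observation of the CONDATA= data set encodes the following equality constraint (writing each arc's flow as a variable):

```latex
% Third observation of CONDATA=COND1:
% coefficient -3 on THRUPUT1, coefficient 4 on R1_GAS, type EQ, right-hand side 0
-3\,x_{\mathrm{THRUPUT1}} + 4\,x_{\mathrm{R1\_GAS}} = 0
\quad\Longleftrightarrow\quad
x_{\mathrm{R1\_GAS}} = \tfrac{3}{4}\,x_{\mathrm{THRUPUT1}}
```

At the constrained optimum reported in Figure 7.7, THRUPUT1 carries 145 units and R1_GAS carries 108.75 = (3/4)(145) units, so the constraint is satisfied.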
To find the minimum cost flow through the network that satisfies the supplies, demands, and side
constraints, invoke PROC NETFLOW as follows:
proc netflow
   nodedata=noded    /* the supply and demand data */
   arcdata=arcd1     /* the arc descriptions       */
   condata=cond1     /* the side constraints       */
   conout=solution;  /* the solution data set      */
run;
The following messages, which appear on the SAS log, summarize the model as read by PROC
NETFLOW and note the progress toward a solution:
NOTE: Number of nodes= 14 .
NOTE: Number of supply nodes= 2 .
NOTE: Number of demand nodes= 4 .
NOTE: Total supply= 180 , total demand= 180 .
NOTE: Number of arcs= 18 .
NOTE: Number of iterations performed (neglecting any constraints)= 14 .
NOTE: Of these, 0 were degenerate.
NOTE: Optimum (neglecting any constraints) found.
NOTE: Minimal total cost= 50600 .
NOTE: Number of <= side constraints= 0 .
NOTE: Number of == side constraints= 2 .
NOTE: Number of >= side constraints= 2 .
NOTE: Number of arc and nonarc variable side constraint coefficients= 8 .
NOTE: Number of iterations, optimizing with constraints= 4 .
NOTE: Of these, 0 were degenerate.
NOTE: Optimum reached.
NOTE: Minimal total cost= 50875 .
NOTE: The data set WORK.SOLUTION has 18 observations and 14 variables.
Unlike PROC LP, which displays the solution and other information as output, PROC NETFLOW
saves the optimum in output SAS data sets that you specify. For this example, the solution is saved
in the SOLUTION data set. It can be displayed with the PRINT procedure as
proc print data=solution;
var _from_ _to_ _cost_ _capac_ _lo_ _name_
_supply_ _demand_ _flow_ _fcost_ _rcost_;
sum _fcost_;
title3 'Constrained Optimum';
run;
Figure 7.7 CONOUT=SOLUTION
Oil Industry Example
Constrained Optimum
Obs _from_ _to_ _cost_ _capac_ _lo_ _name_
1 refinery 1 r1 200 175 50 thruput1
2 refinery 2 r2 220 100 35 thruput2
3 r1 ref1 diesel 0 75 0
4 r1 ref1 gas 0 140 0 r1_gas
5 r2 ref2 diesel 0 75 0
6 r2 ref2 gas 0 100 0 r2_gas
7 middle east refinery 1 63 95 20 m_e_ref1
8 u.s.a. refinery 1 55 99999999 0
9 middle east refinery 2 81 80 10 m_e_ref2
10 u.s.a. refinery 2 49 99999999 0
11 ref1 diesel servstn1 diesel 18 99999999 0
12 ref2 diesel servstn1 diesel 36 99999999 0
13 ref1 gas servstn1 gas 15 70 0
14 ref2 gas servstn1 gas 17 35 5
15 ref1 diesel servstn2 diesel 17 99999999 0
16 ref2 diesel servstn2 diesel 23 99999999 0
17 ref1 gas servstn2 gas 22 60 0
18 ref2 gas servstn2 gas 31 99999999 0
Obs _SUPPLY_ _DEMAND_ _FLOW_ _FCOST_ _RCOST_
1 . . 145.00 29000.00 .
2 . . 35.00 7700.00 29
3 . . 36.25 0.00 .
4 . . 108.75 0.00 .
5 . . 8.75 0.00 .
6 . . 26.25 0.00 .
7 100 . 80.00 5040.00 .
8 80 . 65.00 3575.00 .
9 100 . 20.00 1620.00 .
10 80 . 15.00 735.00 .
11 . 30 30.00 540.00 .
12 . 30 0.00 0.00 12
13 . 95 68.75 1031.25 .
14 . 95 26.25 446.25 .
15 . 15 6.25 106.25 .
16 . 15 8.75 201.25 .
17 . 40 40.00 880.00 .
18 . 40 0.00 0.00 7
========
50875.00
Notice that, in CONOUT=SOLUTION (Figure 7.7), the optimal flow through each arc in the network
is given in the variable named _FLOW_, and the cost of flow through each arc is given in the variable
_FCOST_.
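As a quick sanity check (not part of the original example), the minimal total cost reported on the SAS log can be recomputed from the CONOUT= data set, because _FCOST_ is simply _COST_ times _FLOW_ for each arc:

```sas
/* Recompute each arc's flow cost and sum it; the total   */
/* should match the minimal total cost on the log, 50875. */
data check;
   set solution;
   fcost2 = _cost_ * _flow_;
run;

proc means data=check sum;
   var fcost2;
run;
```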
Figure 7.8 Oil Industry Solution
[Network diagram of the optimal solution: the network of Figure 7.6 annotated with the optimal flow on each arc, for example 145 units through refinery1, 35 units through refinery2, and 108.75 units on the arc from r1 to ref1 gas.]
Syntax: NETFLOW Procedure
Below are statements used in PROC NETFLOW, listed in alphabetical order as they appear in the
text that follows.
PROC NETFLOW options ;
CAPACITY variable ;
COEF variables ;
COLUMN variable ;
CONOPT ;
COST variable ;
DEMAND variable ;
HEADNODE variable ;
ID variables ;
LO variable ;
NAME variable ;
NODE variable ;
PIVOT ;
PRINT options ;
QUIT ;
RESET options ;
RHS variables ;
ROW variables ;
RUN ;
SAVE options ;
SHOW options ;
SUPDEM variable ;
SUPPLY variable ;
TAILNODE variable ;
TYPE variable ;
VAR variables ;
Functional Summary
The following table outlines the options available for the NETFLOW procedure classified by function.
Table 7.1 Functional Summary
Description Statement Option
Input Data Set Options:
Arcs input data set PROC NETFLOW ARCDATA=
Nodes input data set PROC NETFLOW NODEDATA=
Constraint input data set PROC NETFLOW CONDATA=
Output Data Set Options:
Unconstrained primal solution data set PROC NETFLOW ARCOUT=
Unconstrained dual solution data set PROC NETFLOW NODEOUT=
Constrained primal solution data set PROC NETFLOW CONOUT=
Constrained dual solution data set PROC NETFLOW DUALOUT=
Convert sparse or dense format input data set into MPS format output data set PROC NETFLOW MPSOUT=
Data Set Read Options:
CONDATA has sparse data format PROC NETFLOW SPARSECONDATA
Default constraint type PROC NETFLOW DEFCONTYPE=
Special COLUMN variable value PROC NETFLOW TYPEOBS=
Special COLUMN variable value PROC NETFLOW RHSOBS=
Used to interpret arc and nonarc variable names PROC NETFLOW NAMECTRL=
No new nonarc variables PROC NETFLOW SAME_NONARC_DATA
No nonarc data in ARCDATA PROC NETFLOW ARCS_ONLY_ARCDATA
Data for an arc found once in ARCDATA PROC NETFLOW ARC_SINGLE_OBS
Data for a constraint found once in CONDATA PROC NETFLOW CON_SINGLE_OBS
Data for a coefficient found once in CONDATA PROC NETFLOW NON_REPLIC=
Data is grouped, exploited during data read PROC NETFLOW GROUPED=
Problem Size Specification Options:
Approximate number of nodes PROC NETFLOW NNODES=
Approximate number of arcs PROC NETFLOW NARCS=
Approximate number of nonarc variables PROC NETFLOW NNAS=
Approximate number of coefficients PROC NETFLOW NCOEFS=
Approximate number of constraints PROC NETFLOW NCONS=
Network Options:
Default arc cost PROC NETFLOW DEFCOST=
Default arc capacity PROC NETFLOW DEFCAPACITY=
Default arc lower flow bound PROC NETFLOW DEFMINFLOW=
Network's only supply node PROC NETFLOW SOURCE=
SOURCE's supply capability PROC NETFLOW SUPPLY=
Network's only demand node PROC NETFLOW SINK=
SINK's demand PROC NETFLOW DEMAND=
Convey excess supply/demand through network PROC NETFLOW THRUNET
Find maximal flow between SOURCE and SINK PROC NETFLOW MAXFLOW
Cost of bypass arc for MAXFLOW problem PROC NETFLOW BYPASSDIVIDE=
Find shortest path from SOURCE to SINK PROC NETFLOW SHORTPATH
Specify generalized networks PROC NETFLOW GENNET
Specify excess demand or supply PROC NETFLOW EXCESS=
Memory Control Options:
Issue memory usage messages to SAS log PROC NETFLOW MEMREP
Number of bytes to use for main memory PROC NETFLOW BYTES=
Proportion of memory for arrays PROC NETFLOW COREFACTOR=
Memory allocated for LU factors PROC NETFLOW DWIA=
Linked list for updated column PROC NETFLOW SPARSEP2
Use 2-dimensional array for basis matrix PROC NETFLOW INVD_2D
Maximum bytes for a single array PROC NETFLOW MAXARRAYBYTES=
Simplex Options:
Use big-M instead of two-phase method, stage 1 RESET BIGM1
Use big-M instead of two-phase method, stage 2 RESET BIGM2
Anti-cycling option RESET CYCLEMULT1=
Interchange first nonkey with leaving key arc RESET INTFIRST
Controls working basis matrix inversions RESET INVFREQ=
Maximum number of L row operations allowed before refactorization RESET MAXL=
Maximum number of LU factor column updates RESET MAXLUUPDATES=
Anti-cycling option RESET MINBLOCK1=
Use first eligible leaving variable, stage 1 RESET LRATIO1
Use first eligible leaving variable, stage 2 RESET LRATIO2
Negates INTFIRST RESET NOINTFIRST
Negates LRATIO1 RESET NOLRATIO1
Negates LRATIO2 RESET NOLRATIO2
Negates PERTURB1 RESET NOPERTURB1
Anti-cycling option RESET PERTURB1
Controls working basis matrix refactorization RESET REFACTFREQ=
Use two-phase instead of big-M method, stage 1 RESET TWOPHASE1
Use two-phase instead of big-M method, stage 2 RESET TWOPHASE2
Pivot element selection parameter RESET U=
Zero tolerance, stage 1 RESET ZERO1=
Zero tolerance, stage 2 RESET ZERO2=
Zero tolerance, real number comparisons RESET ZEROTOL=
Pricing Options:
Frequency of dual value calculation RESET DUALFREQ=
Pricing strategy, stage 1 RESET PRICETYPE1=
Pricing strategy, stage 2 RESET PRICETYPE2=
Used when P1SCAN=PARTIAL RESET P1NPARTIAL=
Controls search for entering candidate, stage 1 RESET P1SCAN=
Used when P2SCAN=PARTIAL RESET P2NPARTIAL=
Controls search for entering candidate, stage 2 RESET P2SCAN=
Initial queue size, stage 1 RESET QSIZE1=
Initial queue size, stage 2 RESET QSIZE2=
Used when Q1FILLSCAN=PARTIAL RESET Q1FILLNPARTIAL=
Controls scan when filling queue, stage 1 RESET Q1FILLSCAN=
Used when Q2FILLSCAN=PARTIAL RESET Q2FILLNPARTIAL=
Controls scan when filling queue, stage 2 RESET Q2FILLSCAN=
Queue size reduction factor, stage 1 RESET REDUCEQSIZE1=
Queue size reduction factor, stage 2 RESET REDUCEQSIZE2=
Frequency of refreshing queue, stage 1 RESET REFRESHQ1=
Frequency of refreshing queue, stage 2 RESET REFRESHQ2=
Optimization Termination Options:
Pause after stage 1; don't start stage 2 RESET ENDPAUSE1
Pause when feasible, stage 1 RESET FEASIBLEPAUSE1
Pause when feasible, stage 2 RESET FEASIBLEPAUSE2
Maximum number of iterations, stage 1 RESET MAXIT1=
Maximum number of iterations, stage 2 RESET MAXIT2=
Negates ENDPAUSE1 RESET NOENDPAUSE1
Negates FEASIBLEPAUSE1 RESET NOFEASIBLEPAUSE1
Negates FEASIBLEPAUSE2 RESET NOFEASIBLEPAUSE2
Pause every PAUSE1 iterations, stage 1 RESET PAUSE1=
Pause every PAUSE2 iterations, stage 2 RESET PAUSE2=
Interior Point Algorithm Options:
Use interior point algorithm PROC NETFLOW INTPOINT
Factorization method RESET FACT_METHOD=
Allowed amount of dual infeasibility RESET TOLDINF=
Allowed amount of primal infeasibility RESET TOLPINF=
Allowed total amount of dual infeasibility RESET TOLTOTDINF=
Allowed total amount of primal infeasibility RESET TOLTOTPINF=
Cut-off tolerance for Cholesky factorization RESET CHOLTINYTOL=
Density threshold for Cholesky processing RESET DENSETHR=
Step-length multiplier RESET PDSTEPMULT=
Preprocessing type RESET PRSLTYPE=
Print optimization progress on SAS log RESET PRINTLEVEL2=
Interior Point Stopping Criteria Options:
Maximum number of interior point iterations RESET MAXITERB=
Primal-dual (duality) gap tolerance RESET PDGAPTOL=
Stop because of complementarity RESET STOP_C=
Stop because of duality gap RESET STOP_DG=
Stop because of infeas_b RESET STOP_IB=
Stop because of infeas_c RESET STOP_IC=
Stop because of infeas_d RESET STOP_ID=
Stop because of complementarity RESET AND_STOP_C=
Stop because of duality gap RESET AND_STOP_DG=
Stop because of infeas_b RESET AND_STOP_IB=
Stop because of infeas_c RESET AND_STOP_IC=
Stop because of infeas_d RESET AND_STOP_ID=
Stop because of complementarity RESET KEEPGOING_C=
Stop because of duality gap RESET KEEPGOING_DG=
Stop because of infeas_b RESET KEEPGOING_IB=
Stop because of infeas_c RESET KEEPGOING_IC=
Stop because of infeas_d RESET KEEPGOING_ID=
Stop because of complementarity RESET AND_KEEPGOING_C=
Stop because of duality gap RESET AND_KEEPGOING_DG=
Stop because of infeas_b RESET AND_KEEPGOING_IB=
Stop because of infeas_c RESET AND_KEEPGOING_IC=
Stop because of infeas_d RESET AND_KEEPGOING_ID=
PRINT Statement Options:
Display everything PRINT PROBLEM
Display arc information PRINT ARCS
Display nonarc variable information PRINT NONARCS
Display variable information PRINT VARIABLES
Display constraint information PRINT CONSTRAINTS
Display information for some arcs PRINT SOME_ARCS
Display information for some nonarc variables PRINT SOME_NONARCS
Display information for some variables PRINT SOME_VARIABLES
Display information for some constraints PRINT SOME_CONS
Display information for some constraints associated with some arcs PRINT CON_ARCS
Display information for some constraints associated with some nonarc variables PRINT CON_NONARCS
Display information for some constraints associated with some variables PRINT CON_VARIABLES
PRINT Statement Qualifiers:
Produce a short report PRINT / SHORT
Produce a long report PRINT / LONG
Display arcs/variables with zero flow/value PRINT / ZERO
Display arcs/variables with nonzero flow/value PRINT / NONZERO
Display basic arcs/variables PRINT / BASIC
Display nonbasic arcs/variables PRINT / NONBASIC
SHOW Statement Options:
Show problem, optimization status SHOW STATUS
Show network model parameters SHOW NETSTMT
Show data sets that have been or will be created SHOW DATASETS
Show options that pause optimization SHOW PAUSE
Show simplex algorithm options SHOW SIMPLEX
Show pricing strategy options SHOW PRICING
Show miscellaneous options SHOW MISC
SHOW Statement Qualifiers:
Display information only on relevant options SHOW / RELEVANT
Display options for current stage only SHOW / STAGE
Miscellaneous Options:
Infinity value PROC NETFLOW INFINITY=
Scale constraint row, nonarc variable column coefficients, or both PROC NETFLOW SCALE=
Maximization instead of minimization PROC NETFLOW MAXIMIZE
Use warm start solution PROC NETFLOW WARM
All-artificial starting solution PROC NETFLOW ALLART
Output complete basis information to ARCOUT= and NODEOUT= data sets RESET FUTURE1
Output complete basis information to CONOUT= and DUALOUT= data sets RESET FUTURE2
Turn off infeasibility or optimality flags RESET MOREOPT
Negates FUTURE1 RESET NOFUTURE1
Negates FUTURE2 RESET NOFUTURE2
Negates SCRATCH RESET NOSCRATCH
Negates ZTOL1 RESET NOZTOL1
Negates ZTOL2 RESET NOZTOL2
Write optimization time to SAS log RESET OPTIM_TIMER
No stage 1 optimization; do stage 2 optimization RESET SCRATCH
Suppress similar SAS log messages RESET VERBOSE=
Use zero tolerance, stage 1 RESET ZTOL1
Use zero tolerance, stage 2 RESET ZTOL2
Interactivity
PROC NETFLOW can be used interactively. You begin by giving the PROC NETFLOW statement,
and you must specify the ARCDATA= data set. The CONDATA= data set must also be specied if
the problem has side constraints. If necessary, specify the NODEDATA= data set.
The variable lists should be given next. If you have variables in the input data sets that have special
names (for example, a variable in the ARCDATA= data set named _TAIL_ that has tail nodes of arcs
as values), it may not be necessary to have many or any variable lists.
The CONOPT, PIVOT, PRINT, QUIT, SAVE, SHOW, RESET, and RUN statements follow and can
be listed in any order. The CONOPT and QUIT statements can be used only once. The others can be
used as many times as needed.
Use the RESET or SAVE statement to change the names of the output data sets. With RESET,
you can also indicate the reasons why optimization should stop (for example, you can indicate the
maximum number of stage 1 or stage 2 iterations that can be performed). PROC NETFLOW then has
a chance to either execute the next statement, or, if the next statement is one that PROC NETFLOW
does not recognize (the next PROC or DATA step in the SAS session), do any allowed optimization
and finish. If no new statement has been submitted, you are prompted for one. Some options of the
RESET statement enable you to control aspects of the primal simplex algorithm. Specifying certain
values for these options can reduce the time it takes to solve a problem. Note that any of the RESET
options can be specified in the PROC NETFLOW statement.
The RUN statement starts or resumes optimization. The PIVOT statement makes PROC NETFLOW
perform one simplex iteration. The QUIT statement immediately stops PROC NETFLOW. The
CONOPT statement forces PROC NETFLOW to consider constraints when it next performs opti-
mization. The SAVE statement has options that enable you to name output data sets; information
about the current solution is put in these output data sets. Use the SHOW statement if you want to
examine the values of options of other statements. Information about the amount of optimization
that has been done and the STATUS of the current solution can also be displayed using the SHOW
statement.
The PRINT statement instructs PROC NETFLOW to display parts of the problem. PRINT ARCS
produces information on all arcs. PRINT SOME_ARCS limits this output to a subset of arcs. There
are similar PRINT statements for nonarc variables and constraints:
print nonarcs;
print some_nonarcs;
print constraints;
print some_cons;
PRINT CON_ARCS enables you to limit constraint information that is obtained to members of a set
of arcs that have nonzero constraint coefficients in a set of constraints. PRINT CON_NONARCS is
the corresponding statement for nonarc variables.
For example, an interactive PROC NETFLOW run might look something like this:
proc netflow arcdata=data set
   other options;
variable list specifications;  /* if necessary */
reset options;
print options;     /* look at problem */
run;               /* do some optimization */
                   /* suppose that optimization stopped for */
                   /* some reason or you manually stopped it */
print options;     /* look at the current solution */
save options;      /* keep current solution */
show options;      /* look at settings */
reset options;     /* change some settings, those that */
                   /* caused optimization to stop */
run;               /* do more optimization */
print options;     /* look at the optimal solution */
save options;      /* keep optimal solution */
If you are interested only in finding the optimal solution, have used SAS variables that have special
names in the input data sets, and want to use default settings for everything, then the following
statement is all you need:
PROC NETFLOW ARCDATA= data set ;
PROC NETFLOW Statement
PROC NETFLOW options ;
This statement invokes the procedure. The following options and the options listed with the RESET
statement can appear in the PROC NETFLOW statement.
Data Set Options
This section briefly describes all the input and output data sets used by PROC NETFLOW. The
ARCDATA= data set, NODEDATA= data set, and CONDATA= data set can contain SAS variables
that have special names, for instance _CAPAC_, _COST_, and _HEAD_. PROC NETFLOW looks for
such variables if you do not give explicit variable list specifications. If a SAS variable with a special
name is found and that SAS variable is not in another variable list specification, PROC NETFLOW
determines that values of the SAS variable are to be interpreted in a special way. By using SAS
variables that have special names, you may not need to have any variable list specifications.
ARCDATA=SAS-data-set
names the data set that contains arc and, optionally, nonarc variable information and nodal
supply/demand data. The ARCDATA= data set must be specied in all PROC NETFLOW
statements.
ARCOUT=SAS-data-set
AOUT=SAS-data-set
names the output data set that receives all arc and nonarc variable data, including flows or
values, and other information concerning the unconstrained optimal solution. The supply and
demand information can also be found in the ARCOUT= data set. Once optimization that
considers side constraints starts, you are not able to obtain an ARCOUT= data set. Instead,
use the CONOUT= data set to get the current solution. See the section "ARCOUT= and
CONOUT= Data Sets" on page 543 for more information.
CONDATA=SAS-data-set
names the data set that contains the side constraint data. The data set can also contain other
data such as arc costs, capacities, lower flow bounds, nonarc variable upper and lower bounds,
and objective function coefficients. PROC NETFLOW needs a CONDATA= data set to solve a
constrained problem or a linear programming problem. See the section "CONDATA= Data
Set" on page 535 for more information.
CONOUT=SAS-data-set
COUT=SAS-data-set
names the output data set that receives an optimal primal solution to the problem obtained by
performing optimization that considers the side constraints. See the section "ARCOUT= and
CONOUT= Data Sets" on page 543 for more information.
DUALOUT=SAS-data-set
DOUT=SAS-data-set
names the output data set that receives an optimal dual solution to the problem obtained by
performing optimization that considers the side constraints. See the section "NODEOUT= and
DUALOUT= Data Sets" on page 545 for more information.
NODEDATA=SAS-data-set
DUALIN=SAS-data-set
names the data set that contains the node supply and demand specifications. You do not need
observations in the NODEDATA= data set for trans-shipment nodes. (Trans-shipment nodes
neither supply nor demand flow.) All nodes are assumed to be trans-shipment nodes unless
supply or demand data indicate otherwise. It is acceptable for some arcs to be directed toward
supply nodes or away from demand nodes.
The use of the NODEDATA= data set is optional in the PROC NETFLOW statement provided
that, if the NODEDATA= data set is not used, supply and demand details are specied by
other means. Other means include using the MAXFLOW or SHORTPATH option, SUPPLY or
DEMAND list variables (or both) in the ARCDATA= data set, and the SOURCE=, SUPPLY=,
SINK=, or DEMAND= option in the PROC NETFLOW statement.
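For instance, a single-source, single-sink problem can skip the NODEDATA= data set entirely and give the supply and demand details in the PROC NETFLOW statement. A minimal sketch, in which the node names factory and market and the data set names arcs0 and sol0 are hypothetical:

```sas
proc netflow
   source=factory supply=100   /* the network's only supply node and its supply */
   sink=market   demand=100    /* the network's only demand node and its demand */
   arcdata=arcs0
   conout=sol0;
run;
```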
NODEOUT=SAS-data-set
names the output data set that receives all information about nodes (supply and demand and
nodal dual variable values) and other information concerning the optimal solution found by the
optimizer when neglecting side constraints. Once optimization that considers side constraints
starts, you are not able to obtain a NODEOUT= data set. Instead, use the DUALOUT= data set
to get the current solution dual information. See the section "NODEOUT= and DUALOUT=
Data Sets" on page 545 for a more complete description.
MPSOUT=SAS-data-set
names the SAS data set that contains converted sparse or dense format input data in MPS
format. Invoking this option directs the NETFLOW procedure to halt before attempting
optimization. For more information about the MPSOUT= option, see the section "Converting
Any PROC NETFLOW Format to an MPS-Format SAS Data Set" on page 546. For more
information about the MPS-format SAS data set, see Chapter 16.
General Options
The following is a list of options you can use with PROC NETFLOW. The options are listed in
alphabetical order.
ALLART
indicates that PROC NETFLOW uses an all-artificial initial solution (Kennington and Helgason
1980, p. 68) instead of the default good path method for determining an initial solution
(Kennington and Helgason 1980, p. 245). The ALLART initial solution is generally not as
good; more iterations are usually required before the optimal solution is obtained. However,
because less time is used when setting up an ALLART start, the savings can offset the added
expenditure of CPU time in later computations.
ARCS_ONLY_ARCDATA
indicates that data for only arcs are in the ARCDATA= data set. When PROC NETFLOW
reads the data in the ARCDATA= data set, memory is not wasted to receive data for nonarc
variables. The read might then be performed faster. See the section "How to Make the Data
Read of PROC NETFLOW More Efficient" on page 566.
ARC_SINGLE_OBS
indicates that for all arcs and nonarc variables, data for each arc or nonarc variable is found in
only one observation of the ARCDATA= data set. When reading the data in the ARCDATA=
data set, PROC NETFLOW knows that the data in an observation is for an arc or a nonarc
variable that has not had data previously read that needs to be checked for consistency. The
read might then be performed faster.
If you specify ARC_SINGLE_OBS, PROC NETFLOW automatically works as if
GROUPED=ARCDATA is also specified. See the section "How to Make the Data Read of
PROC NETFLOW More Efficient" on page 566.
BYPASSDIVIDE=b
BYPASSDIV=b
BPD=b
should be used only when the MAXFLOW option has been specified; that is, PROC NETFLOW
is solving a maximal flow problem. PROC NETFLOW prepares to solve maximal flow
problems by setting up a bypass arc. This arc is directed from the SOURCE to the SINK and
will eventually convey flow equal to INFINITY minus the maximal flow through the network.
The cost of the bypass arc must be expensive enough to drive flow through the network, rather
than through the bypass arc. However, the cost of the bypass arc must be less than the cost of
artificial variables (otherwise these might have nonzero optimal value and a false infeasibility
error will result). Also, the cost of the bypass arc must be greater than the eventual total cost of
the maximal flow, which can be nonzero if some network arcs have nonzero costs. The cost of
the bypass arc is set to the value of the INFINITY= option divided by the value of the
BYPASSDIVIDE= option. Valid values for the BYPASSDIVIDE= option must be greater than
or equal to 1.1.
If there are no nonzero costs of arcs in the MAXFLOW problem, the cost of the bypass arc
is set to 1.0 (-1.0 if maximizing) if you do not specify the BYPASSDIVIDE= option. The
reduced costs in the ARCOUT= data set and the CONOUT= data set will correctly reflect the
value that would be added to the maximal flow if the capacity of the arc is increased by one
unit. If there are nonzero costs, or if you specify the BYPASSDIVIDE= option, the reduced
costs may be contaminated by the cost of the bypass arc and no economic interpretation can
be given to reduced cost values. The default value for the BYPASSDIVIDE= option (in the
presence of nonzero arc costs) is 100.0.
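A minimal maximal flow sketch that uses this option; the data set names maxarcs and maxsol and the node names src and snk are hypothetical:

```sas
proc netflow
   maxflow              /* find the maximal flow from SOURCE to SINK  */
   bypassdivide=1000    /* divide INFINITY to get the bypass arc cost */
   source=src sink=snk
   arcdata=maxarcs
   arcout=maxsol;
run;
```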
BYTES=b
indicates the size of the main working memory (in bytes) that PROC NETFLOW will allocate.
The default value for the BYTES= option is close to the number of bytes of the largest
contiguous memory that can be allocated for this purpose. The working memory is used to
store all the arrays and buffers used by PROC NETFLOW. If this memory has a size smaller
than what is required to store all arrays and buffers, PROC NETFLOW uses various schemes
that page information between memory and disk.
PROC NETFLOW uses more memory than the main working memory. The additional memory
requirements cannot be determined at the time when the main working memory is allocated.
For example, every time an output data set is created, some additional memory is required. Do
not specify a value for the BYTES= option equal to the size of available memory.
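For example, to cap the main working memory at roughly 50 MB and have PROC NETFLOW report its memory usage, a sketch like the following could be used (the data set names are those of the oil industry example above; the 50 MB figure is an arbitrary illustration):

```sas
proc netflow
   bytes=50000000   /* about 50 MB of main working memory         */
   memrep           /* write memory usage messages to the SAS log */
   arcdata=arcd1 condata=cond1 conout=solution;
run;
```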
CON_SINGLE_OBS
improves how the CONDATA= data set is read. How it works depends on whether the
CONDATA has a dense or sparse format.
If CONDATA has the dense format, specifying CON_SINGLE_OBS indicates that, for each
constraint, data can be found in only one observation of CONDATA.
If CONDATA has a sparse format, and data for each arc and nonarc variable can be found in
only one observation of CONDATA, then specify the CON_SINGLE_OBS option. If there
are n SAS variables in the ROW and COEF list, then each arc or nonarc can have at most n
constraint coefficients in the model. See the section "How to Make the Data Read of PROC
NETFLOW More Efficient" on page 566.
COREFACTOR=c
CF=c
enables you to specify the maximum proportion of memory to be used by the arrays frequently
accessed by PROC NETFLOW. PROC NETFLOW strives to maintain all information required
during optimization in core. If the amount of available memory is not great enough to store the
arrays completely in core, either initially or as memory requirements grow, PROC NETFLOW
can change the memory management scheme it uses. Large problems can still be solved. When
necessary, PROC NETFLOW transfers data from random access memory (RAM) or core that
can be accessed quickly but is of limited size to slower access large capacity disk memory.
This is called paging.
Some of the arrays and buffers used during constrained optimization either vary in size, are not
required as frequently as other arrays, or are not required throughout the simplex iteration. Let
a be the amount of memory in bytes required to store frequently accessed arrays of nonvarying
size. Specify the MEMREP option in the PROC NETFLOW statement to get the value for a
and a report of memory usage. If the size of the main working memory BYTES=b multiplied
by COREFACTOR=c is greater than a, PROC NETFLOW keeps the frequently accessed
arrays of nonvarying size resident in core throughout the optimization. If the other arrays
cannot t into core, they are paged in and out of the remaining part of the main working
memory.
If b multiplied by c is less than a, PROC NETFLOW uses a different memory scheme. The
working memory is used to store only the arrays needed in the part of the algorithm being
executed. If necessary, these arrays are read from disk into the main working area. Paging,
if required, is done for all these arrays, and sometimes information is written back to disk at
the end of that part of the algorithm. This memory scheme is not as fast as the other memory
schemes. However, problems can be solved with memory that is too small to store every array.
PROC NETFLOW is capable of solving very large problems in a modest amount of available
memory. However, as more time is spent doing input/output operations, the speed of PROC
NETFLOW decreases. It is important to choose the value of the COREFACTOR= option
carefully. If the value is too small, the memory scheme that needs to be used might not be as
efficient as another that could have been used had a larger value been specified. If the value
is too large, too much of the main working memory is occupied by the frequently accessed,
nonvarying sized arrays, leaving too little for the other arrays. The amount of input/output
operations for these other arrays can be so high that another memory scheme might have been
used more beneficially.
The valid values of COREFACTOR=c are between 0.0 and 0.95, inclusive. The default value
for c is 0.75 when there are over 200 side constraints, and 0.9 when there is only one side
constraint. When the problem has between 2 and 200 constraints, the value of c lies between
the two points (1, 0.9) and (201, 0.75).
DEFCAPACITY=c
DC=c
requests that the default arc capacity and the default nonarc variable value upper bound be c.
If this option is not specified, then DEFCAPACITY=INFINITY.
DEFCONTYPE=c
DEFTYPE=c
DCT=c
specifies the default constraint type. This default constraint type is either less than or equal to
or is the type indicated by DEFCONTYPE=c. Valid values for this option are
LE, le, <= for less than or equal to
EQ, eq, = for equal to
GE, ge, >= for greater than or equal to
The values do not need to be enclosed in quotes.
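For example, if most side constraints in a model are greater-than-or-equal-to constraints, a sketch like the following (reusing the oil industry data set names from earlier in this chapter) makes >= the default for observations that do not carry an explicit _TYPE_ value:

```sas
proc netflow
   defcontype=GE    /* constraints without an explicit type default to >= */
   arcdata=arcd1
   condata=cond1
   conout=solution;
run;
```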
DEFCOST=c
requests that the default arc cost and the default nonarc variable objective function coefficient
be c. If this option is not specified, then DEFCOST=0.0.
DEFMINFLOW=m
DMF=m
requests that the default lower flow bound through arcs and the default lower value bound of
nonarc variables be m. If a value is not specified, then DEFMINFLOW=0.0.
DEMAND=d
specifies the demand at the SINK node specified by the SINK= option. The DEMAND= option
should be used only if the SINK= option is given in the PROC NETFLOW statement and
neither the SHORTPATH option nor the MAXFLOW option is specified. If you are solving a
minimum cost network problem and the SINK= option is used to identify the sink node, but
the DEMAND= option is not specified, then the demand at the sink node is made equal to the
network's total supply.
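A sketch of a minimum cost flow problem with a single source and sink follows; all names are hypothetical. If DEMAND= were omitted here, the demand at the sink would default to the network's total supply:

```sas
/* Hypothetical sketch: single-source, single-sink minimum cost flow */
proc netflow
   arcdata=arc0
   source=sourcenode supply=100
   sink=sinknode demand=100
   conout=solution;
   tail from;
   head to;
   cost cost;
run;
```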
DWIA=i
controls the initial amount of memory to be allocated to store the LU factors of the working
basis matrix. DWIA stands for DW initial allocation, and i is the number of nonzeros and
matrix row operations in the LU factors that can be stored in this memory. Due to fill-in in
the U factor and the growth in the number of row operations, it is often necessary to move
information about elements of a particular row or column to another location in the memory
allocated for the LU factors. This process leaves some memory temporarily unoccupied.
Therefore, DWIA=i must be greater than the memory required to store only the LU factors.
PROC NETFLOW Statement ! 483
Occasionally, it is necessary to compress the U factor so that it again occupies contiguous
memory. Specifying too large a value for DWIA= means that more memory is required by
PROC NETFLOW. This might cause more expensive memory mechanisms to be used than
if a smaller but adequate value had been specified for DWIA=. Specifying too small a value
for the DWIA= option can make time-consuming compressions more numerous. The default
value for the DWIA= option is eight times the number of side constraints.
EXCESS=option
enables you to specify how to handle excess supply or demand in a network, if it exists.
For pure networks, EXCESS=ARCS and EXCESS=SLACKS are valid options; by default,
EXCESS=ARCS is used. Note that if you specify EXCESS=SLACKS, then the interior point
solver is used and you must specify the output data set with the CONOUT= option. For
more details, see the section Using the New EXCESS= Option in Pure Networks: NETFLOW
Procedure on page 614.
For generalized networks, you can specify either EXCESS=DEMAND or EXCESS=SUPPLY
to indicate that the network has excess demand or excess supply, respectively. For more details,
see the section Using the New EXCESS= Option in Generalized Networks: NETFLOW
Procedure on page 621.
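A sketch of a pure network with excess supply handled by slack variables follows; the data set and variable names are hypothetical:

```sas
/* Hypothetical sketch: EXCESS=SLACKS forces the interior point solver,
   so the CONOUT= output data set must be specified */
proc netflow
   excess=slacks
   arcdata=arc0
   condata=con0
   conout=solution;
   tail from;
   head to;
   cost cost;
run;
```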
GENNET
This option is necessary if you need to solve a generalized network flow problem and no arc
multipliers are specified in the ARCDATA= data set.
GROUPED=grouped
PROC NETFLOW can take a much shorter time to read data if the data have been grouped prior
to the PROC NETFLOW call. For instance, if the ARCDATA= data set was grouped by the
values of the NAME list variable before PROC NETFLOW was called, then PROC NETFLOW
can conclude that any NAME list variable value that differs from the one in the previous
observation is new; it does not need to check whether that NAME has been read in an earlier
observation. See the section How to Make the Data Read of PROC NETFLOW More Efficient
on page 566.
- GROUPED=ARCDATA indicates that the ARCDATA= data set has been grouped by
values of the NAME list variable. If _NAME_ is the name of the NAME list vari-
able, you could use PROC SORT DATA=ARCDATA; BY _NAME_; prior to calling
PROC NETFLOW. Technically, you do not have to sort the data, only ensure that
all similar values of the NAME list variable are grouped together. If you specify
the ARCS_ONLY_ARCDATA option, PROC NETFLOW automatically works as if
GROUPED=ARCDATA is also specified.
- GROUPED=CONDATA indicates that the CONDATA= data set has been grouped.
If the CONDATA= data set has a dense format, GROUPED=CONDATA indicates that
the CONDATA= data set has been grouped by values of the ROW list variable. If _ROW_
is the name of the ROW list variable, you could use PROC SORT DATA=CONDATA;
BY _ROW_; prior to calling PROC NETFLOW. Technically, you do not have to sort the
data, only ensure that all similar values of the ROW list variable are grouped together. If
you specify the CON_SINGLE_OBS option, or if there is no ROW list variable, PROC
NETFLOW automatically works as if GROUPED=CONDATA has been specified.
If the CONDATA= data set has the sparse format, GROUPED=CONDATA indicates
that the CONDATA= data set has been grouped by values of the COLUMN list vari-
able. If _COL_ is the name of the COLUMN list variable, you could use PROC SORT
DATA=CONDATA; BY _COL_; prior to calling PROC NETFLOW. Technically, you do
not have to sort the data, only ensure that all similar values of the COLUMN list variable
are grouped together.
- GROUPED=BOTH indicates that both GROUPED=ARCDATA and GROUPED=CONDATA
are TRUE.
- GROUPED=NONE indicates that the data sets have not been grouped; that is, neither
GROUPED=ARCDATA nor GROUPED=CONDATA is TRUE. This is the default, but
performance is much better if you can specify GROUPED=ARCDATA, GROUPED=CONDATA,
or GROUPED=BOTH.
A data set like
... _XXXXX_ ....
bbb
bbb
aaa
ccc
ccc
is a candidate for the GROUPED= option. Similar values are grouped together. When PROC
NETFLOW is reading the ith observation, either the value of the _XXXXX_ variable is the same
as the (i-1)st (that is, the previous observation's) _XXXXX_ value, or it is a new _XXXXX_
value not seen in any previous observation. This also means that if the ith _XXXXX_ value is
different from the (i-1)st _XXXXX_ value, the value of the (i-1)st _XXXXX_ variable will
not be seen in any observations i, i+1, ....
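The grouping step can be sketched as follows; the data set and variable names are hypothetical:

```sas
/* Hypothetical sketch: group ARCDATA= by the NAME list variable, then
   tell PROC NETFLOW the data are grouped so it can read them faster */
proc sort data=arc0;
   by _name_;
run;

proc netflow
   grouped=arcdata
   arcdata=arc0
   conout=solution;
   tail from;
   head to;
   cost cost;
   name _name_;
run;
```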
INFINITY=i
INF=i
is the largest number used by PROC NETFLOW in computations. A value that is too small can
adversely affect the solution process. You should avoid specifying an enormous value for the
INFINITY= option, because numerical roundoff errors can result. If a value is not specified,
then INFINITY=99999999. The INFINITY= option cannot be assigned a value less than 9999.
INTPOINT
indicates that the interior point algorithm is to be used. The INTPOINT option must be
specified if you want the interior point algorithm to be used for solving network problems;
otherwise, the simplex algorithm is used. For linear programming problems (problems
with no network component), PROC NETFLOW must use the interior point algorithm, so you
need not specify the INTPOINT option.
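A sketch of requesting the interior point algorithm for a network problem follows; the data set and variable names are hypothetical:

```sas
/* Hypothetical sketch: solve a network problem with the interior point
   algorithm instead of the default simplex algorithm */
proc netflow
   intpoint
   arcdata=arc0
   condata=con0
   conout=solution;
   tail from;
   head to;
   cost cost;
run;
```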
INVD_2D
controls the way in which the inverse of the working basis matrix is stored. How this matrix is
stored affects computations as well as how the working basis or its inverse is updated. The
working basis matrix is defined in the section Details: NETFLOW Procedure on page 534.
If INVD_2D is specified, the working basis matrix inverse is stored as a matrix. Typically,
this memory scheme is best when there are few side constraints or when the working basis is
dense.
If INVD_2D is not specified, lower (L) and upper (U) factors of the working basis matrix
are used. U is an upper triangular matrix and L is a lower triangular matrix corresponding
to a sequence of elementary matrix row operations. The sparsity-exploiting variant of the
Bartels-Golub decomposition is used to update the LU factors. This scheme works well when
the side constraint coefficient matrix is sparse or when many side constraints are nonbinding.
MAXARRAYBYTES=m
specifies the maximum number of bytes an individual array can occupy. This option is of most
use when solving large problems and the amount of available memory is insufficient to store
all arrays at once. Specifying the MAXARRAYBYTES= option ensures that arrays that need
a large amount of memory do not consume too much memory at the expense of other arrays.
There is one array that contains information about nodes and the network basis spanning tree
description. This tree description enables computations involving the network part of the
basis to be performed very quickly and is the reason why PROC NETFLOW is more suited
than PROC LP to solving constrained network problems. It is beneficial for this array to be
stored in core when possible; otherwise, it must be paged, which slows down the computations.
Try not to specify a MAXARRAYBYTES=m value smaller than the amount of memory needed
to store the main node array. This memory amount is reported in the SAS log if you specify
the MEMREP option in the PROC NETFLOW statement.
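A sketch of combining the two options follows; the byte limit and all data set and variable names are hypothetical:

```sas
/* Hypothetical sketch: MEMREP reports memory usage in the SAS log;
   check the reported size of the main node array before lowering the
   MAXARRAYBYTES= value further */
proc netflow
   memrep
   maxarraybytes=8000000
   arcdata=arc0
   condata=con0
   conout=solution;
   tail from;
   head to;
   cost cost;
run;
```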
MAXFLOW
MF
specifies that PROC NETFLOW solve a maximum flow problem. In this case, the PROC
NETFLOW procedure finds the maximum flow from the node specified by the SOURCE=
option to the node specified by the SINK= option. PROC NETFLOW automatically assigns a
supply equal to the INFINITY= option value to the SOURCE= node and a demand equal to
the INFINITY= option value to the SINK= node. In this way, the MAXFLOW option sets up
a maximum flow problem as an equivalent minimum cost problem.
You can use the MAXFLOW option when solving any flow problem (not necessarily a
maximum flow problem) when the network has one supply node (with infinite supply) and
one demand n