Deterministic Dynamic Programming
Deterministic Dynamic Programming
The main idea of dynamic programming (DP) is to break the problem down into
subproblems (more manageable). The calculations are then performed recursively where the
optimal solution of a subproblem is used as input for the next problem. The
solution to the entire problem is available when the last subproblem is solved.
the way recursive calculations are performed depends on how the problem is decomposed
In particular, the subproblems are usually linked by constraints.
common. The feasibility of these common restrictions remains in all iterations.
Figure 1
1. The calculations at each stage are a function of the feasible routes of that stage, and only of
that stage.
A current stage is connected only to the immediately preceding stage (without having to
count the previous stages) based on the summary of the shortest distances of the stage
immediately preceding.
Recursive equation. This section shows how they can be mathematically expressed.
recursive calculations in example 1.
1
Dynamic programming
Seafi(xi) the shortest distance to the node in the stage, and define d(xi-1,xi) as the distance
from nodoxi-1 to nodoxi.
All distances are measured from 0 when establishing f.0(x0= 1) = 0. The main recursive equation
express the shortest distance fi(xi) at the stage as a function of the next node, xi. In
PD terminology is known as the state in the stage. The state connects the stages.
successively in a way that allows for optimal feasible decisions to be made at a future stage
regardless of the decisions that have been made at all previous stages.
The definition of the state leads to the following unifying framework for the PD.
Example 1 uses forward recursion in which calculations proceed from the stage
1 to stage 3. The same example can be solved through backward recursion.
starting at stage 3 and ending at stage 1.
Naturally, forward and backward recursion gives the same optimal solution. Even
when the forward procedure seems more logical, most of the PD literature
use backward recursion. The reason for this preference is that, generally, the
Back recursion can be more efficient from a computational point of view.
The demonstration will also give us the opportunity to present the PD calculations.
in a compact tabular form.
As I studied each application, I paid special attention to the three basic elements of
PD model:
2
Dynamic programming
Of the three elements, the definition of 'state' tends to be the most subtle. The applications that are
they present here show that the definition of the state varies according to the situation that must be
model. However, as you research each application, you will realize that it is useful.
consider the following questions:
You can improve your understanding of the concept of state if you question the validity of the way in which
It was defined here. Try another definition that seems "more logical" to you and use it in the calculations.
recursive. You will soon discover that the definitions presented here are correct. In the meantime, the
The mental process associated will allow you to better understand the role of the states in development.
recursive equation of PD.
The problem of lamochila is also known in the literature as the team problem of
flight (determination of the most valuable items that a jet pilot carries on board) and the
problem unloading a container (determination of the most valuable items that are
they will be loaded onto a navy ship). It seems that the three names were coined for
guarantee an equal representation of the three branches of the armed forces: Army, Force
Air and Navy!
The recursive equation (backward) is developed for the general problem of assigning items.
to a backpack with a weight capacity of W. Seamila quantity of units of the item in the
backpack, and I will define unit income and the weight of the item. The general problem
is represented as:
3
Dynamic programming
The three elements of the model are
1. The stage is represented by the article i, i = 1, 2, ..., n.
2. The alternatives in the stage are the quantity of units of the article, mi=0.1,...(W/Wi),
where (W/Wi) is the largest integer that is less than or equal to W/Wi. This definition allows for the
solution distribute some, none, or all resources or any of the articles.
performance paramiesriI.
3. The state in the stage is represented by xi, the total weight assigned to the stages.
(articles)i,i+ 1,…, yn. This definition recognizes that the weight limit is the
the only restriction that links all the stages.
Define
fi(xi) = maximum yield for the stages i, i+1, and n, given the state xi
Example 2
A 4-ton boat can be loaded with one or more of three items. The following table gives the
unit price, wi, in tons and the unit income in thousands of dollars, ri, for the article i. The
The objective is to determine the quantity of units of each item that will maximize the yield.
total.
4
Dynamic programming
Since the unit weight and the maximum weight are integers, the state only assumes values.
integers.
Assuming the project duration is ten weeks and that the strength of
minimum labor required in the weeks of workers. The model assumes
an additional cost is incurred if the workforce for a week exceeds the requirement
minimum or if an additional hiring is made in a week. For simplicity,
There is no cost incurred when a dismissal occurs.
The cost of maintaining a workforce greater than the minimum is well the week
incurs in excess costC1(xI-biSixIis greater than xI-1, hiring occurs for a
additional cost of C2(xi–xi-1).
The elements of the PD model are defined as follows:
1.The stage is represented by the week i, i=1, 2,…,n.
2. The alternatives in stage eleven, the number of workers in the week.
3. The state at stage i is xi21, the number of workers available in week i=1.
Example 3
A contractor estimates that the size of the workforce needed during the next 5
weeks consist of 5, 7, 8, 4, and 6 workers, respectively. The preserved surplus labor.
in the workforce will cost $300 per worker per week, and a new hire in
Any week will incur a fixed cost of $400 plus $200 per worker per week.
Machines that remain in service for a long time incur a high cost of
maintenance and can be replaced after a certain number of years in operation.
The situation involves determining the most economical age of a machine.
5
Dynamic programming
Let's assume that the machine replacement problem spans years. At the beginning of each
year, a machine is either kept in service for another year or is replaced by a new one. They are
r(t), c(t) ys(t) the annual income, the operating cost, and the salvage value, respectively, of
a machine of years. The acquisition cost of a new machine in any year is I.
The elements of the PD model are the following:
1. The stage is represented by the year i=1, 2,…,n.
The alternatives at this stage (year) are to keep (K) or replace (R) the machine at the beginning.
of the year.
3. The state at the stage is the age of the machine at the beginning of the year.
Since the machine is years old at the beginning of the year, define
Example 4
A company needs to determine the replacement policy for a machine that is currently
is three years old, for the next 4 years (n=4). A machine that is 6 years old
must be replaced. The cost of a new machine is $100,000. The following table provides the
problem data.
The determination of feasible values for the age of the machine is somewhat complicated.
Figure 2 summarizes the network that represents the problem. At the beginning of year 1, we have a machine.
3 years old.
6
Dynamic programming
Figure 2
We can either replace it (R) or keep it (K) for another year. If the replacement occurs, the
the new machine will be one year old at the beginning of year 2; otherwise, the preserved machine
will be 4 years old. The same logic applies to the start of years 2 to 4.
The network shows that at the beginning of year 2 the possible ages of the machine are 1 and 4 years. At the beginning
At year 3, the possible ages are 1, 2, and 5 years old, and at the beginning of year 4, the possible ages are 1, 2, and 3.
and 6 years. The network also assumes that the machine will be discarded at the beginning of year 5
regardless of age.
The solution of the network shown in figure 2 is equivalent to finding the longest path (that is, the
maximum income) from the beginning of year 1 to the end of year 4. We will use the form
Table to solve the problem. All values are in thousands of pesos.
Let's observe that if a machine is replaced in year 4 (that is, at the end of the horizon of
planning), its income will include the salvage value, s(t), of the replaced machine and the value
from waste, (1), from the replacement machine. Furthermore, if in year 4 a machine of years
age is preserved, its salvage value will be (t+1).
7
Dynamic programming
Investment model
Suppose you want to invest the amounts P1, P2, ..., Pn, at the beginning of each of the following
children. He has two investment opportunities in two banks. BBVA pays an interest rate
CityBanamex pays 2, both compounded annually. To encourage deposits, both
banks pay bonuses on new investments in the form of a percentage of the amount
inverted. The percentages of the respective bonds for BBVA and CityBanamex are qi1 and qi2
for the year. The bonds are paid at the end of the year in which the investment was made and can be reinvested.
at any of the banks in the immediately following year. This means that only
bonds and new fresh money can be invested in any of the banks. However, once
that when a deposit is made, the investment must remain in the bank until the end of the year.
The reinvested amount includes only new money plus any investment bonds.
completed in the year
Define
fi(xi) = optimal value of investments in the years i, i+1,…, yn, given xi.
Then define as the accumulated sum at the end of the year, given that Iiy (xi-Ii) are the
Investments made in the year in BBVA and in CityBanamex, respectively.
8
Dynamic programming
The terms qn1 and qn2 are added because the bonds for the year are part of
the final accumulated sum of money from the investment.
Example 5
Suppose you want to invest $4000 now and $2000 at the beginning of years 2 to 4. The interest rate
offered by BBVA is 8% compounded annually, and the bonds over the next 4 years
1.8%, 1.7%, 2.1% and 2.5%, respectively. The annual interest rate offered by
CityBanamex is 0.2% lower than that of BBVA, but its bonds are 0.5% higher. The goal
It is to maximize the capital accumulated after 4 years.
DIMENSIONALITY PROBLEM
In all the PD models presented in this chapter, the state at any stage is
represented by a single element. For example, in the backpack model (which was seen at the beginning
from this unit), the only restriction is the weight of the item. More realistically in this case,
the volume of the backpack can also be a viable constraint, in which case it is said that in
At any stage, the state is bidimensional: weight and volume.
The increase in the number of state variables increases the calculations at each stage. This is
particularly evident in PD tabular calculations due to the number of rows in each
Table corresponds to all possible combinations of the state variables. This difficulty
Computationally, it is sometimes known in the literature as the curse of dimensionality.
The following example was chosen to demonstrate the problem of dimensionality. It also serves
to demonstrate the relationship between linear programming and dynamic.
9
Dynamic programming
Example 6
SNK Manufacturing produces two products. The daily capacity of the manufacturing process is
430 minutes. Product 1 requires 2 minutes per unit, and product 2 requires 1 minute per unit.
unit. There is no limit to the amount produced of product 1, but the daily demand for
product 2 has 230 units. The unit profit of product 1 is $2 and that of product 2
It is $5. Determine the optimal solution using DP.
10
Dynamic programming
2. For the network in the figure, it is desired to determine the shortest route between cities 1 and 7. Define the
stages and states through backward recursion, and then solve the problem.
3. A student must choose 10 elective courses from four different departments, with at least
less one course from each department. The 10 courses are assigned to the four departments of a
way to maximize "knowledge". The student measures their knowledge on a scale of 100
points and appears with the following table:
Habitat for Humanity is a wonderful charity organization that builds houses for families.
needed through voluntary labor and donations of construction materials. A
Eligible families can choose from three house sizes: 1000, 1100, and 1200 square feet. 2
11
Dynamic programming
high need. During the following 6 months, the branch can have a maximum of 23
volunteers. The following data summarizes the ratings of the applications and the required amount
of volunteers.
Which requests must the committee approve?
5. Luxor Travel organizes week-long tourist trips to southern Egypt. The agency offers 7,4,7 and
8 cars for rent during the next 4 weeks. Luxor Travel subcontracted to a
local automotive dealership to meet the needs of car rentals.
The dealership charges a weekly rental fee of $220 per car, plus a fixed fee of $500.
for any rental transaction. Luxor, however, can choose whether to keep them in rental for
One more week and just keep paying the rent. What is the best way for Luxor
how do you handle the rental situation?
6. My 13-year-old son runs a lawn mowing business with 10 clients. He mows for each client.
the lawn 3 times a year, and charges $50 for each cut. Just paid $200 for a new mower.
The operating and maintenance cost of the cutter is $120 for the first year of service and
From there it increases by 20% per year. A one-year-old cutter has a resale value.
from $150, which is then reduced by 10% per year. My son, who plans to keep his business
Until he is 16 years old, he thinks it is more economical to buy a new cutter every 2 years.
Base your decision on the fact that the price of a new cutter will increase by only 10% when
year. Is their decision justified?
7. Solve class problem 5, assuming quer1 = 0.085 yr2 = 0.08. Also, assume that
$5000
12