Week 2

Lecture – 06

Topic: Unconstrained Optimization


• Formulation of Optimization Problems

• Convex and nonconvex optimization

• Unconstrained Optimization: gradient descent

• Multicriteria Optimization and Pareto Optimality


Optimization problem in Economics
• We have a budget of M, which we can invest in two sectors S1 and S2

• The two sectors provide returns of f1(x) and f2(x) respectively for investment ‘x’

• I need to decide how to divide my budget in the two sectors to maximize the net return

• Utility function: F(x) = f1(x) + f2(M-x)


• Maximize F(x) with respect to x!

• Simplest case: f1 and f2 are smooth differentiable functions, ‘x’ is continuous


• We can calculate the derivative F’(x), equate it to 0, and solve for ‘x’!
Multi-variable functions and Indifference Curves
We can extend the above concept to multi-variable cases also!

Define utility function F(x1,x2) = f1(x1) + f2(x2)

Usually (x1,x2) will have some constraints, like a total budget: x1 + x2 <= M

Certain combinations of (x1, x2) may provide the same value of F(x1, x2)
We plot such values along “indifference curves”
Each indifference curve is associated with one value of the objective function!
Gradient Descent based Optimization
How to solve an optimization problem?

We usually cast optimization problems as minimization (flip signs if needed)!

Linear Regression: squared error loss function (h(x) - y)2 where h(x) = W.X + b
Optimal values of (W,b): argmin ∑i (W.Xi +b - yi)2
Optimal values can be calculated by equating the derivatives to 0

In some cases, analytical approach does not work


Reasons:
1) Utility function not differentiable
2) derivative equations cannot be solved

In such cases, we need numerical approach!


Gradient Descent based Optimization
Numerical approach to minimizing any function f:

1) Start with any initial point x0


2) Move to point x1 = x0 – a.f’(x0) /// f’(x0): derivative of f(x) at x=x0; a: constant learning rate

3) If x1 = x0, STOP /// x0 is a minimum of function f
else set x0 = x1, GOTO 2

Above approach works for vector-valued ‘x’ also!


f’(x): vector of partial derivatives w.r.t. each variable!

For vector x, if the objective is separable (a sum of per-dimension terms), the optimization of each dimension can be decoupled from the rest!
Gradient Descent based Optimization
Consider f(x) = x^2 – 4x + 4; f’(x) = 2x – 4
Set initial point x=5, learning rate a = 0.1
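The iteration above can be sketched in a few lines of Python; this is a minimal sketch of the slide's example (f(x) = x^2 – 4x + 4, x0 = 5, a = 0.1), with the exact-equality stopping test replaced by a small tolerance as is usual in floating point:

```python
def gradient_descent(grad, x0, lr=0.1, tol=1e-8, max_iter=1000):
    """Minimize a 1-D function given its derivative `grad`."""
    x = x0
    for _ in range(max_iter):
        x_new = x - lr * grad(x)        # step 2: move against the gradient
        if abs(x_new - x) < tol:        # step 3: x1 == x0 (within tolerance), STOP
            return x_new
        x = x_new
    return x

# f(x) = x^2 - 4x + 4 has f'(x) = 2x - 4 and its minimum at x = 2
x_min = gradient_descent(lambda x: 2 * x - 4, x0=5.0, lr=0.1)
print(round(x_min, 4))  # converges to 2.0
```

Each step multiplies the distance to the minimum by (1 – 2·lr) = 0.8 here, so the iterates approach x = 2 geometrically.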
Gradient Descent based Optimization
• Does it always converge?
- Depends on the “learning rate”
- low learning rate: slow convergence
- high learning rate: may oscillate around the minimum!
• Does it always give the optimal solution?
- Yes, if the function is Convex (has a unique minimum)
- Otherwise, it may converge to any local minimum

• Alternative: Newton-Raphson Method

• Update step: x1 = x0 – (f’’(x0))-1 * f’(x0)
- faster convergence, learning rate gone!
- Performance depends on properties of the Hessian f’’(x0)
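On the same quadratic example as before, the Newton-Raphson update x1 = x0 – f’(x0)/f’’(x0) reaches the minimum in a single step, because the second derivative of a quadratic is constant (a small sketch, reusing the slide's f(x) = x^2 – 4x + 4):

```python
# f'(x) = 2x - 4, f''(x) = 2 (constant for a quadratic)
f_prime = lambda x: 2 * x - 4
f_double_prime = lambda x: 2.0

x = 5.0
x = x - f_prime(x) / f_double_prime(x)   # one Newton step, no learning rate
print(x)  # 2.0: lands exactly on the minimum
```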
Convex and Non-Convex Optimization
Convex Function:
The value f(x) at any ‘x’ between ‘x1’ and ‘x2’ always lies below the straight line connecting f(x1) and f(x2)

Strictly convex functions have a unique minimum

Numerical methods will converge there (if they converge at all)

Non-convex function: can have multiple local minima!

Source: Wikipedia

Numerical methods can converge to any local minimum (typically the one closest to the initial point)
Solution: i) start descent from multiple initial points
ii) find the local minimum reached from each of them
iii) take the minimum of these local minima!
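The multi-start recipe (i)–(iii) can be sketched as follows; the non-convex function f(x) = x^4 – 4x^2 + x is an assumed example (not from the slides) with one local minimum near x ≈ 1.35 and the global minimum near x ≈ -1.47:

```python
def grad_descent(grad, x0, lr=0.01, steps=2000):
    """Plain gradient descent; returns the local minimum it settles into."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

f = lambda x: x**4 - 4 * x**2 + x        # non-convex: two local minima
grad = lambda x: 4 * x**3 - 8 * x + 1

# i) start descent from multiple initial points,
# ii) find the local minimum reached from each,
# iii) take the minimum of these local minima
starts = [-3.0, -1.0, 0.5, 3.0]
locals_found = [grad_descent(grad, x0) for x0 in starts]
best = min(locals_found, key=f)
print(round(best, 2))  # the global minimum, near x = -1.47
```

Starts on the right of the origin fall into the local minimum near x ≈ 1.35; only the multi-start comparison in step (iii) recovers the global one.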
Multi-objective Optimization
Pareto-Optimal Solutions
• In multi-objective optimization, goodness of a solution is defined by “Dominance”

• A solution ‘x1’ dominates another solution ‘x2’ if (assuming minimization problem for all objectives)
i. fi(x1) <= fi(x2) for every objective ‘i’
ii. fj(x1) < fj(x2) for at least one objective ‘j’

• Given two solutions (x1, x2), it is possible that neither dominates the other!
• A solution ‘x’ is pareto-optimal if it is not dominated by any other solution!

[Figure: among the candidate solutions, 1 dominates 2; 5 dominates 1; (1, 4) do not dominate each other. The non-dominated set of all possible solutions: pareto-optimal set! Credits: S.D. Sudhoff]
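The dominance definition above translates directly into code; a minimal sketch (assuming a minimization problem in every objective, with made-up 2-objective points):

```python
def dominates(x1, x2):
    """x1 dominates x2 (minimization): no worse in every objective,
    strictly better in at least one."""
    return (all(a <= b for a, b in zip(x1, x2))
            and any(a < b for a, b in zip(x1, x2)))

def pareto_set(points):
    """The non-dominated (pareto-optimal) subset of `points`."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

pts = [(1, 5), (2, 3), (4, 2), (3, 4), (5, 1)]
print(pareto_set(pts))  # [(1, 5), (2, 3), (4, 2), (5, 1)]
```

Only (3, 4) is dominated here (by (2, 3)); the remaining four points are mutually non-dominated, which is exactly the situation where neither of two solutions dominates the other.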


Pareto-Optimal Front
• The boundary defined by the set of all points in the objective-function space that are images of the pareto-optimal solutions is the pareto-optimal front!

• Objective of multi-objective optimization:


- Find a set of solutions as close to the pareto-optimal front as possible
- Find a set of pareto-optimal solutions as diverse as possible! (each solution is good w.r.t. a different objective)

• Possible Solution: attach weights to each objective to define utility function!


F(x) = w1*f1(x) + w2*f2(x) + …….. + wn*fn(x)
Credits: S.D. Sudhoff
Pareto-Optimal Solutions in Economics
• Resource allocation among multiple sectors: how to divide a budget among multiple sectors so that no reallocation can improve the return from one sector without reducing the return from another

• Pareto Improvements in Trade: International trade can lead to Pareto improvements by allowing
countries to specialize in producing goods where they have a comparative advantage and then trade for
goods produced more efficiently by other countries.

• Income Redistribution: A transfer of resources from a wealthy individual to a poorer individual can be
Pareto-improving if it increases overall social welfare without making the wealthy individual worse off.

• Social Welfare Maximization: A social welfare function that achieves Pareto optimality
ensures that no feasible reallocation of resources can make any individual better off
without making someone else worse off.
Lecture – 07
Topic: Constrained Optimization
• Linear Program

• Lagrange Multipliers

• Exterior Point Methods

• Interior Point Methods


Idea of Constrained Optimization
• We have an objective function to minimize/maximize w.r.t. one or more variables
- Allocate resources to different sectors to maximize the net revenue/utility from all of them

• However, not all possible solutions are acceptable!


• A ‘feasible solution’ must satisfy one or more constraints (represented by equations/inequalities)
- I may have a budget (sum of my allocations cannot exceed the budget)
- I may not want to allocate more than a given fraction of my budget to any single sector (fairness)
- I may not want to allocate more to a particular sector compared to another particular sector

• Set of solutions that satisfy all constraints: feasible set!


Constrained Linear Optimization
Simplest case: objective function and constraint equations are all linear!
Linear program!

• We convert problem constraints into a system of linear equations with the help of slack variables!

• Characterize the solutions of the system of linear equations

• A certain solution of the obtained system would be an optimal solution
Constrained Linear Programming

Slack variables: s1, s2!


Fundamental Theorem of Linear Programming
If the optimal value of the objective function in a linear programming problem exists, then that value must occur at one or more of the basic feasible solutions of the initial system.

For small (e.g. 2-D) systems, we can solve graphically

In general, we find the basic feasible solutions and evaluate the objective at all of them to find the optimum

Simplex algorithm: achieves the same through successive row and column operations on the matrix
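The “evaluate the objective at the basic feasible solutions” idea can be sketched directly for a 2-D case. The LP below (maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x, y >= 0) is an assumed toy example, not the one on the slide; each basic feasible solution of a 2-D LP is the feasible intersection of two constraint boundaries:

```python
from itertools import combinations

# Each constraint as a*x + b*y <= c (bounds included as -x <= 0, -y <= 0)
cons = [(1, 1, 4), (1, 3, 6), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Intersection of the two boundary lines, or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(p, eps=1e-9):
    return all(a * p[0] + b * p[1] <= c + eps for a, b, c in cons)

# Basic feasible solutions: feasible pairwise boundary intersections
vertices = [p for c1, c2 in combinations(cons, 2)
            if (p := intersect(c1, c2)) and feasible(p)]
best = max(vertices, key=lambda p: 3 * p[0] + 2 * p[1])
print(best, 3 * best[0] + 2 * best[1])  # best vertex (4, 0), objective value 12
```

This brute-force enumeration is exactly what the fundamental theorem licenses; the simplex algorithm just visits these vertices far more cleverly.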
Lagrange Multiplier with Inequality Constraints
• Consider general objective and constraints (may be non-linear)
• General approach: Convert the constraints into parts of the objective!

• Original problem:
• Min f(x) such that h1(x) = 0, h2(x) = 0 …… hk(x) = 0

• Alternative problem: Min F(x, λ1, λ2, …. λk) = f(x) + λ1*h1(x) + λ2*h2(x) + …… + λk*hk(x)
• The idea: if λ1, λ2 etc are high enough, then the optimal solution must have low values of h1(x), h2(x) etc
automatically!

• We now have unconstrained minimization problem with additional variables

• Approach: Find the stationary points of F with respect to all the variables,
i.e. the partial derivatives of F should be 0 w.r.t. all variables

Solution correctness guaranteed by Lagrange Multiplier Theorem!


Solving with Lagrange Multipliers
• General approach: Convert the constraints into parts of the objective!
• Original problem:
• Min f(x) such that h(x) = 0, g(x) <= 0 [any number of equality and inequality constraints]

• All inequality constraints are converted to equality constraints with slack variables
• Lagrange Multipliers added as earlier

• Augmented objective: F(x) = f(x) + Σiλi*hi(x) + Σjμj*(gj(x) + sj2)

• Karush-Kuhn-Tucker (KKT) Conditions for optimality of solution:


1) Gradient of F(x) = 0
2) Constraints: h[x] = 0 & g[x] ≤ 0 (all constraints satisfied)
3) Complementary slackness (for the slack variables sj): μj*sj = 0 for every j
4) Sign condition on the inequality multipliers: μj ≥ 0

If all of these are satisfied by x*, then it is the optimal solution to original problem
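A fully worked Lagrange-multiplier example makes the stationarity conditions concrete. The problem below (minimize x^2 + y^2 subject to x + y = 1) is an assumed illustration, solved by hand in the comments and verified in code:

```python
# Minimize f(x, y) = x^2 + y^2 subject to h(x, y) = x + y - 1 = 0.
# Augmented objective: F(x, y, lam) = x^2 + y^2 + lam * (x + y - 1)
# Stationarity (all partial derivatives of F equal to 0):
#   dF/dx   = 2x + lam     = 0
#   dF/dy   = 2y + lam     = 0
#   dF/dlam = x + y - 1    = 0
# The first two give x = y = -lam/2; the constraint then forces x = y = 1/2.
x = y = 0.5
lam = -2 * x
assert 2 * x + lam == 0 and 2 * y + lam == 0 and x + y - 1 == 0
print(x, y, lam)  # 0.5 0.5 -1.0
```

Geometrically, at (1/2, 1/2) the gradient of f is parallel to the gradient of the constraint, which is precisely what the multiplier equations encode.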
Interior Point Methods
• Basic Approach: start with a point within the feasible set (need not be optimal w.r.t. objective)
• Perturb the solution towards optimal (maybe gradient descent)
• Check if constraints are violated – limit the perturbation accordingly!
• Implemented using Barrier Functions – very low inside feasible region, very high along and beyond the borders!
Exterior Point Methods
• Basic approach: start at initial point outside feasible region
• Optimize w.r.t. objective function plus constraints
• Gradually increase constraint weights till they are satisfied
• Automatically pushes solution towards the feasible set
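A quadratic-penalty sketch on the same assumed toy problem (minimize f(x) = (x - 3)^2 subject to x <= 1): we start at the unconstrained optimum x = 3, which is outside the feasible set, and keep raising the constraint weight rho until the iterate is pushed to the boundary:

```python
def penalty_grad(x, rho):
    # d/dx of (x - 3)^2 + rho * max(0, x - 1)^2
    return 2 * (x - 3) + (2 * rho * (x - 1) if x > 1 else 0.0)

x = 3.0                          # unconstrained optimum: infeasible start
rho = 1.0
for _ in range(20):              # outer loop: raise the constraint weight
    lr = 0.4 / (1 + rho)         # shrink the step so descent stays stable as rho grows
    for _ in range(200):         # inner loop: minimize objective + penalty
        x -= lr * penalty_grad(x, rho)
    rho *= 2.0
print(round(x, 3))  # pushed towards the feasible set: x -> 1 from outside
```

For each rho, the penalized minimizer is (3 + rho)/(1 + rho) > 1, so the whole trajectory approaches the feasible set from outside, the hallmark of an exterior point method.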
Lecture – 08
Topic: Heuristic Search Techniques
• Graph Search

• Shortest Path Algorithms

• Best-First Approach: A*

• Examples and properties


Graph
• Graph: A collection of vertices, some pairs of which may be connected by edges!
• Edges can be directed or undirected!
• Each vertex may indicate one state of a system
• Neighbouring vertices: if system can move from one state to another!

Directed Graph Undirected Graph


Graph Paths and Connectivity
• Path: a sequence of edges (head-tail in case of directed graphs)
• Node A connected to Node B if there is a path from A to B
• In directed graph, A->B path exists (red, orange), A->C path doesn’t exist
• In undirected graph, both A->B and A->C paths exist

[Figure: the same three vertices A, B, C shown in a directed graph and in an undirected graph]

Path between every pair of vertices: graph is connected!


Graph Search
We start at a particular vertex (root)
Our aim: discover the other vertices by following the edges
We may look for a specific vertex also

Breadth First Search: whenever we reach a vertex, note down its unexplored neighbors, visit them in order (queue)
Depth First Search: whenever we reach a vertex, pick one of its unexplored neighbors and explore it (stack)

[Figure: visit orders on the same tree under Breadth-First Search and Depth-First Search]
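The queue/stack distinction above can be sketched side by side; the adjacency list is an assumed toy tree:

```python
from collections import deque

graph = {1: [2, 3], 2: [4], 3: [5, 6], 4: [], 5: [], 6: []}

def bfs(start):
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()               # queue (FIFO): visit noted neighbours in order
        order.append(v)
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

def dfs(start):
    order, seen, stack = [], set(), [start]
    while stack:
        v = stack.pop()                   # stack (LIFO): keep exploring the newest vertex
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        stack.extend(reversed(graph[v]))  # reversed so the lower-numbered neighbour goes first
    return order

print(bfs(1))  # [1, 2, 3, 4, 5, 6]  (level by level)
print(dfs(1))  # [1, 2, 4, 3, 5, 6]  (one branch to the bottom, then backtrack)
```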


Shortest Paths in Weighted Graphs
Weighted graph: each edge has a non-negative weight!
Cost of a path: sum of weights of all edges along it
Shortest path problem: find the path between two vertices that has least cost

[Figure: weighted graph over A, B, C, D.
A->C->B path cost: 6
A->C->D->B path cost: 7
So A->C->B is the shortest path!]

Dijkstra’s Algorithm: finds shortest path from fixed origin to all other vertices
Floyd-Warshall’s Algorithm: finds shortest path between all pairs of vertices!
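Dijkstra's algorithm can be sketched with a priority queue; the edge weights below are an assumed reconstruction of the slide's example (A->C costs 2, C->B costs 4, C->D costs 1, D->B costs 4, so A->C->B costs 6 and A->C->D->B costs 7):

```python
import heapq

graph = {"A": {"C": 2}, "B": {}, "C": {"B": 4, "D": 1}, "D": {"B": 4}}

def dijkstra(source):
    """Least path cost from `source` to every vertex (non-negative weights)."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                      # stale queue entry, already improved
        for w, cost in graph[v].items():  # relax every outgoing edge
            if d + cost < dist[w]:
                dist[w] = d + cost
                heapq.heappush(heap, (dist[w], w))
    return dist

print(dijkstra("A"))  # {'A': 0, 'B': 6, 'C': 2, 'D': 3}
```

The cost 6 to B comes via C, matching the shortest path A->C->B from the figure.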
Heuristic Search
• BFS, DFS involve backtracking (revisiting vertices)
• Shortest-path algorithms assume the entire graph will be revealed together
• These may not be possible in an actual graph-search scenario
• We may have to choose one vertex at each step, and may not be able to backtrack!
• Our final aim: to reach a “goal vertex” at least cost!
• Eg. Chess!

[Figure: a search graph from START with edge costs and two GOAL vertices]
Heuristic Search
• We need to have an estimate – how far is a goal vertex from a given node?
• From current vertex, we will move to that neighbouring vertex through which the shortest path seems to pass
(based on estimates)!
• Heuristic function h(x): estimated shortest path cost from vertex x to any goal vertex (available for each vertex)

[Figure: the same search graph, now with the heuristic estimate h(x) marked at each vertex]

Estimated path cost for vertex x: f(x) = g(x) + h(x), where g(x) is the cost from START to vertex x
A* Algorithm
• Set x = START vertex; add x to OPEN list; set f(x) = h(x), g(x) = 0, g(y) = INFTY for all other vertices ‘y’; CLOSED list: {}
• Repeat:
• 1) At the current vertex x, if x = GOAL, stop and report g(x)
• 2) Else, for each neighbour y of x (that is not in the CLOSED list):
Add y to the OPEN list (if not already present)
If g(y) > g(x) + c(x,y) [c(x,y): weight of edge x->y]
Set g(y) = g(x) + c(x,y), f(y) = g(y) + h(y), set Prev(y) = x
• 3) Add x to the CLOSED list, move to the vertex in the OPEN list with the least value of f, and GOTO 1!
[Figure: the example graph from START vertex A to goals G and H, with the trace table after initialization]

Vertex  h(x)  g(x)  f(x)  Prev(x)
A       7     0     7     -
B       6     INF
C       5     INF
D       1     INF
E       1     INF
F       4     INF
A* Algorithm: trace on the example graph (each frame expands the OPEN-list vertex with the least f)

After expanding A:  C: g=2, f=7, Prev=A;  D: g=5, f=6, Prev=A
After expanding D:  E: g=6, f=7, Prev=D
After expanding C:  B: g=3, f=9, Prev=C;  F: g=5, f=9, Prev=C
After expanding E:  goal G is reached with g=8

Shortest path: A->D->E->G (cost 8)
Note: A->C->E->G is also 8, but wasn’t chosen!
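The trace above can be reproduced in code. The edge weights below are reconstructed from the g-values in the trace table (an assumption where the slide graph is ambiguous), and h is the slide's heuristic; a priority queue keyed by f = g + h plays the role of the OPEN list:

```python
import heapq

graph = {"A": {"C": 2, "D": 5}, "B": {}, "C": {"B": 1, "E": 4, "F": 3},
         "D": {"E": 1}, "E": {"G": 2}, "F": {"H": 4}, "G": {}, "H": {}}
h = {"A": 7, "B": 6, "C": 5, "D": 1, "E": 1, "F": 4, "G": 0, "H": 0}
goals = {"G", "H"}

def a_star(start):
    g = {v: float("inf") for v in graph}
    g[start] = 0
    prev = {start: None}
    open_heap = [(h[start], start)]       # OPEN list, ordered by f = g + h
    closed = set()
    while open_heap:
        f, x = heapq.heappop(open_heap)
        if x in closed:
            continue
        if x in goals:                    # step 1: stop and report the path cost
            path = []
            while x is not None:
                path.append(x)
                x = prev[x]
            return path[::-1], f          # f == g at a goal, since h(goal) = 0
        closed.add(x)
        for y, c in graph[x].items():     # step 2: relax each neighbour not in CLOSED
            if y not in closed and g[x] + c < g[y]:
                g[y] = g[x] + c
                prev[y] = x
                heapq.heappush(open_heap, (g[y] + h[y], y))
    return None, float("inf")

print(a_star("A"))  # (['A', 'D', 'E', 'G'], 8)
```

The expansion order A, D, C, E matches the slide frames, and the reported path A->D->E->G with cost 8 matches the trace.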
Guarantees of A* Algorithm
• At any point, f(x) indicates the cost of the shortest path from START to a GOAL VIA ‘x’ and the
vertices visited till then
• Guaranteed to find shortest path if h(x) is ADMISSIBLE
• Admissible heuristic function: always UNDER-ESTIMATES least cost to any Goal state!
• Otherwise, we may choose wrong vertex and never come back to correct vertex!
• If h(x) is too low, we may have to search entire graph (equivalent to Breadth First Search)

h(x) g(x) f(x) Prev(x)


5 D
START A 1 GOAL A 7 0 7 -
2 2
C 4 E G B 6 100
1
B 3 2 4 C 5 100

F D 10 100
H
4
GOAL E 4 100
F 4 100
Guarantees of A* Algorithm (contd.)
• Simple choice of h(x): minimum cost to any of its neighbors!

Trace with the inadmissible heuristic:
After expanding A:  C: g=2, f=7, Prev=A;  D: g=5, f=15, Prev=A
After expanding C:  B: g=3, f=9, Prev=C;  E: g=6, f=10, Prev=C;  F: g=5, f=9, Prev=C

Path found: A->C->F->H (cost 9)! The cheaper path through D (cost 8) is missed because h(D) = 10 over-estimates.


Applications in Economics
• Consider each vertex represents a state of the economy
• From the current state we wish to reach a desired state, but we do not have the privilege of knowing the full graph
• From current state, a few actions can be taken, and we can estimate their impacts (heuristic
function)
• Eg. during Covid, we can either impose a lockdown or let things carry on normally
• State defined by economy and public health

• Supply Chain Optimization: Companies often need to optimize their supply chains to minimize
costs or maximize efficiency. Heuristic graph search algorithms can be used to find the most
efficient routes for transportation and distribution, taking into account factors like
distance, time, and costs.

• Portfolio Optimization: Selecting the best combination of assets to maximize returns


while minimizing risk. Heuristic graph search algorithms can help investors explore
the vast space of possible asset combinations more efficiently, considering factors
like expected returns, volatility, and correlations.
• Graph Search: finding shortest path from start vertex to any goal vertex

• Can use graph algorithms like Dijkstra’s Algorithm if full graph is known

• For limited-horizon search, we need to estimate distances to goal state using Heuristic function

• A*: well-known heuristic search algorithm (Best-First Search)

• Guaranteed to give the right solution if the heuristic is admissible -> always UNDERESTIMATES cost to goal

• Too low a value of the heuristic function slows down the search process by exploring unnecessary vertices!
Lecture – 09
Topic: Multi-objective Heuristic Search and Game Trees
• Game Tree

• Alpha-Beta Pruning

• Extending to Multi-objective case


Why is Chess So Hard?
• At each step, many possibilities available, many goal states
• We need a heuristic to estimate the distance from a state to a goal state (how?)
• It is difficult to think ahead beyond 2-3 steps!
• Plus, the opponent also moves!
• Tic-tac-toe also has a similar structure with far fewer configurations!

chesscoach.com
Game Tree Formulation
• Can be used to represent any two/multi-player contest; one player moves at each step, turnwise
• A vertex represents a state of the contest, labelled with a heuristic function
• Alternate levels of the tree represent the moves of each player
• At each level, the player can move from the current state to another by choosing an action
• Each action has a cost; each player aims to reach a goal state at minimum total cost
• In some cases, whoever reaches a goal state first, wins!
• Each state has a heuristic function, indicating the “goodness” of the state to one player

[Figure: a game tree with levels alternately labelled P1 MOVES / P2 MOVES, heuristic values at the vertices, and the running costs paid by P1 and P2]
Game Tree Formulation (contd.)
• At each step, P1 tries to reach a vertex with the least value of the heuristic function
• At the next step, P2 tries to reach a vertex with the maximum value of the heuristic function!
• Each leaf node is characterized by a value
• Low values are preferred by P1, high values by P2

[Figure: playing the tree out this way, P2 wins the game, but pays a higher cost (12) compared to P1 (11)!]
Optimal Path Identification
• Each leaf vertex already has a “utility” measure
• Use it to update the heuristic values at the remaining vertices
• Consider each pre-leaf vertex: if P2 makes the move there, they will surely move to the leaf with the higher value
• Copy that value to the pre-leaf vertex!
• In the previous round, P1 will surely move to the pre-leaf node with the lower value; update the estimates accordingly!
• Extend the same logic to the previous layers: the MIN-MAX Algorithm!

• The updated heuristics show us the optimal path for both players!
• No matter what P2 does, P1 will win if they make the first move correctly!
• But if P2 moves first, the result can be quite different!

[Figure: the game tree with values propagated bottom-up by MIN-MAX, once with P1 moving first and once with P2 moving first]
Alpha-Beta Pruning
• Is it really necessary to evaluate the entire
game tree?
• Sometimes, we already know that a particular
move can never give the best result!
• Prune the unnecessary parts!
• Alpha-pruning: prune subtrees with high values!
• Beta-pruning: prune subtrees with low values!

Source: wikipedia
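MIN-MAX with alpha-beta pruning can be sketched on a small assumed tree, following the slides' convention that P1 minimizes and P2 maximizes; internal nodes are lists of children and leaves are utility values:

```python
def minimax(node, p1_to_move, alpha=float("-inf"), beta=float("inf")):
    """MIN-MAX with alpha-beta pruning; P1 minimizes, P2 maximizes."""
    if not isinstance(node, list):        # leaf: return its utility value
        return node
    if p1_to_move:                        # P1 picks the child with the least value
        best = float("inf")
        for child in node:
            best = min(best, minimax(child, False, alpha, beta))
            beta = min(beta, best)
            if beta <= alpha:             # prune: P2 would never allow this branch
                break
        return best
    best = float("-inf")                  # P2 picks the child with the highest value
    for child in node:
        best = max(best, minimax(child, True, alpha, beta))
        alpha = max(alpha, best)
        if beta <= alpha:                 # prune: P1 would never allow this branch
            break
    return best

# P1 moves at the root; each sublist is a position where P2 moves next
tree = [[3, 6], [5, 1], [4, 8]]
print(minimax(tree, True))  # 5: the value P1 can guarantee
```

The third subtree is cut short: once P2 finds the leaf 8 there, its value already exceeds the best option P1 has secured (5), so the branch cannot affect the result.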
Multi-objective Heuristic Search
• Edge cost c(x,y) is vector-valued!
• Heuristic function h(x), estimated path costs f(x) and g(x) also vector-valued!
• Need to change A* algorithm accordingly!

• When comparing the f(y) values in the OPEN list, we may find a ‘non-dominated’ set of vertices
• Eg. f(y1) = [3 4 8], f(y2) = [2 5 7], f(y3) = [4 5 8] : {y1, y2} is the non-dominated set!

• We may shift to any one vertex from the non-dominated set

• But we cannot put it in the CLOSED list [other paths haven’t been explored]!

• Path to goal vertex is guaranteed to be non-dominated!


Multi-objective Heuristic Search
[Figure: worked example expanding vertices n1, n2, n4, n7 in turn. Credits: Dasgupta, Chakrabarti, De Sarkar]


Multi-objective Game Trees

Max-dominance Min-dominance

Non-dominance

Dasgupta, Chakrabarti, De Sarkar


Applications in Economics
• Game trees can represent strategic interactions among relevant stakeholders in any market scenario, including competing suppliers and consumers. They allow economists to study market equilibria, and different players to plan their strategies

• In behavioral economics, game trees allow economists to predict how rational agents
will behave in strategic situations, such as oligopoly competition, bargaining
situations, or negotiations where each player's decision depends on their
expectations of others' actions.

• Macroeconomic policy-making is a context requiring multi-objective optimization.


— A bank’s monetary policy must balance low inflation, low unemployment, low
balance of trade deficit, etc.
Lecture – 10
Topic: Clustering and Segmentation
• Feature-based representation

• Clusters in Feature Space

• Agglomerative Clustering

• K-means Clustering

Product Ratings based on Features
We can identify which ones are “similar” and group them together!
Feature Selection
Decision Tree for Feature Selection
 Each observation represented with a “feature vector” in a suitable vector space!
 Calculate distances between every pair of examples
 Examples which are in same cluster, should be closer than examples that are in different clusters!!
 How to partition the observations into such clusters?
[Figure: points in feature space grouped into two clusters; the mean distances within the clusters (2 and 1.5) are much smaller than the mean distance between them (6)]
Initially, consider each point as a separate cluster
Keep merging two clusters if they are close enough!
Agglomerative Clustering
Cluster 1: [Xa1, Xa2, …., Xam], Cluster 2: [Xb1, Xb2, …., Xbn]
Are these two clusters “close”?
Points are close based on Euclidean Distance, but what about point sets ??

We need a measure of distance between point sets


Compute distances between m*n pairs of points from the two sets

Single linkage: minimum distance between two points from the two sets
Complete linkage: maximum distance between two points from the two sets
Average linkage: mean distance between two points from the two sets

Initially, consider each point as a separate cluster


Merge a pair of clusters if they are closer than a threshold (using any of these linkage criteria)!
Repeat till no more mergers are possible!
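The merge loop above can be sketched with single linkage on 1-D points (an assumed example; real data would be feature vectors with Euclidean distance):

```python
def single_linkage(c1, c2):
    """Minimum distance between any two points from the two clusters."""
    return min(abs(a - b) for a in c1 for b in c2)

def agglomerate(points, threshold):
    clusters = [[p] for p in points]          # each point starts as its own cluster
    while len(clusters) > 1:
        # distance of every cluster pair under the chosen linkage
        pairs = [(single_linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d >= threshold:                    # no more mergers possible
            break
        clusters[i] += clusters[j]            # merge the closest pair
        del clusters[j]
    return clusters

print(agglomerate([1.0, 1.5, 2.0, 8.0, 8.4, 20.0], threshold=3.0))
# [[1.0, 1.5, 2.0], [8.0, 8.4], [20.0]]
```

Swapping `single_linkage` for a max (complete linkage) or a mean (average linkage) changes only the distance function, not the merge loop.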
Agglomerative Clustering (Threshold: 5)
[Figure: successive mergers of the closest clusters (distances 2, 3, 4); once every remaining inter-cluster distance exceeds the threshold (TOO HIGH!), the algorithm terminates]
Prototype-based Clustering
K-means Clustering (MacQueen, 1967)
Calculate Distances to Initial Clusters
Assignment to Clusters
Re-estimate Cluster Centers
Stopping Criteria
• No change of cluster assignment for any data-point
• No (or negligible) change of cluster centers
• Intra-cluster distance at local minima
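The assignment step, the center re-estimation step, and the “no change of cluster centers” stopping criterion fit in a short sketch; the 2-D points and initial centers are assumed toy data:

```python
def kmeans(points, centers, max_iter=100):
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center (Euclidean distance)
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda i: (p[0] - centers[i][0]) ** 2
                                  + (p[1] - centers[i][1]) ** 2)
            clusters[i].append(p)
        # Update step: re-estimate each center as the mean of its cluster
        new_centers = [(sum(p[0] for p in c) / len(c),
                        sum(p[1] for p in c) / len(c)) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:        # no change of cluster centers: stop
            break
        centers = new_centers
    return centers, clusters

pts = [(0, 0), (0, 1), (1, 0), (8, 8), (8, 9), (9, 8)]
centers, clusters = kmeans(pts, centers=[(0, 0), (8, 8)])
print(centers)  # cluster means near (0.33, 0.33) and (8.33, 8.33)
```

Re-running with different initial centers can land in a different local optimum, which is exactly the initialization sensitivity noted below.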
Heuristics
• K-means provides only a local optimum
• Success of K-means clustering often depends on the choice of initial clusters
• Choice of K (the number of clusters) is empirical
Applications of Clustering in Economics
• Market Segmentation: The aim here is to identify groups of consumers, who may
behave as a collective. Such groups may be formed on the basis of multiple attributes
including age, gender, demographics etc.

• Economic Policy Design/Analysis: Impacts of policies at Local, regional, national levels


may be predicted from experience in similar contexts, such as GDP, employment rates,
income levels, socio-economic characteristics, infrastructure etc. Clustering
countries/regions/localities on these bases helps in this

• Labour Market Analysis: Clustering of workers in labour force based on skills, education,
experience, and other relevant factors. This helps in understanding labor market trends,
wage differentials, and workforce mobility, also identifying sets of workers to advertise
specific job opportunities.
• Clustering: partition examples into coherent and distinct groups

• Requires distance measure to compare individual or sets of observations

• Agglomerative clustering: initially consider all examples as separate clusters, then


progressively merge clusters with low distance

• K-means: highly popular iterative clustering algorithm based on Euclidean distance

• Sensitive to initialization, number of clusters (K) needs to be pre-specified
