Ch 02 Intelligent Agents
Part 1
Outline

Agents and Environments
An agent is any entity that can be viewed as:
• perceiving its environment through sensors, and
• acting upon that environment through actuators.
Agents and Environments [cont.]
Agents include humans, robots, softbots, thermostats, etc.
• human:
perceives: with eyes, ears, nose, hands, ...,
acts: with voice, hands, arms, legs, ...
• robot:
perceives: with video-cameras, infra-red sensors, radar, ...
acts: with wheels, motors, ...
• softbot:
perceives: receiving keystrokes, files, network packets, ...
acts: displaying on the screen, writing files, sending network packets
• thermostat:
perceives: with heat sensor, ...
acts: electric impulses to valves, devices, ...
Key concepts
Percept and Percept sequences
percept: the collection of the agent's perceptual inputs at any given instant
percept sequence: the complete history of everything the agent has
ever perceived
Agent program
Internally, the agent function for an artificial agent is implemented by an agent program.
Example: Vacuum Cleaner [cont.]
A simple agent function
If the current square is dirty, then suck; otherwise, move to the other square.
⚫ Agent's function ⇒ look-up table
⚪ For many agents this is a very large table
Percept sequence                  Action
[A, Clean]                        Right
[A, Dirty]                        Suck
[B, Clean]                        Left
[B, Dirty]                        Suck
[A, Clean], [B, Clean]            Left
[A, Clean], [B, Dirty]            Suck
...                               ...
Note: this agent function depends only on the last percept, not on the whole percept sequence.
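A minimal Python sketch (not from the slides; the dictionary covers only the entries listed above) of the look-up-table idea:

# Hypothetical sketch: the agent function stored as an explicit look-up table,
# keyed by the whole percept sequence observed so far.
TABLE = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Clean"), ("B", "Clean")): "Left",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
    # ... one entry for every possible percept sequence
}

percept_sequence = []

def table_driven_agent(percept):
    percept_sequence.append(percept)
    return TABLE.get(tuple(percept_sequence))  # None if the sequence is not in the table

print(table_driven_agent(("A", "Dirty")))  # Suck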
Example: Vacuum Cleaner [cont.]
Corresponding agent program
function Reflex-Vacuum-Agent([location,status]) returns an action
if status = Dirty then return Suck
else if location = A then return Right
else if location = B then return Left
Rational Agents
Intuition: a rational agent is one that “does the right thing”
i.e., every entry in the agent's function table is filled in correctly
What is the right thing?
Approximation: the most "successful" thing:
In a given environment, according to the percept sequence it
receives, an agent generates a sequence of actions, ...
... causing the environment to go through a sequence of states.
If this sequence of states is desirable, then the agent has performed well.
⇒ need a performance measure to evaluate any sequence of
environment states
Performance measure should be objective.
Rational Agents [cont.]
What is rational at any given time depends on four things:
The performance measure that defines the criterion of success
The agent’s prior knowledge of the environment
The actions that the agent can perform
The agent’s percept sequence to date (from sensors)
Definition of a rational agent
For each possible percept sequence, a rational agent should select an action
that is expected to maximize its performance measure, given the evidence
provided by the percept sequence and whatever built-in knowledge the agent has.
Example of Rational Agents: vacuum-cleaner agent
Under the following assumptions:
Performance measure: one point for each clean square at each time step, over 1000
time steps
Environment knowledge:
• “geography” known a priori,
• dirt distribution and agent initial location unknown
• [clean squares cannot become dirty again ]
Perception: self location, presence of dirt
Actions: Left, Right, Suck
Is the agent rational?
⇒ Yes! (given the performance measure above)
Beware: if a penalty for each move is given, the agent behaves poorly
⇒ better agent: do nothing once it is sure all the squares are clean
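A small simulation sketch (my own code; it assumes points are awarded before each action) that reproduces this performance measure for the reflex agent:

# Sketch: two-square vacuum world, one point per clean square per time step,
# 1000 time steps, reflex agent as defined earlier.
def reflex_vacuum_agent(location, status):
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

def run(dirt=("Dirty", "Dirty"), start="A", steps=1000):
    status = {"A": dirt[0], "B": dirt[1]}
    location, score = start, 0
    for _ in range(steps):
        score += sum(1 for s in status.values() if s == "Clean")  # performance measure
        action = reflex_vacuum_agent(location, status[location])
        if action == "Suck":
            status[location] = "Clean"
        elif action == "Right":
            location = "B"
        else:
            location = "A"
    return score

print(run())  # close to the maximum of 2000 even when both squares start dirty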
Rationality vs. Omniscience vs. Perfection
Remark
Rationality ≠ Omniscience!
• An omniscient agent knows for sure the outcome of its actions
⇒ omniscience impossible in reality
• A rational agent may only know “up to a reasonable confidence”
(e.g., when crossing a road, what if something falling from a plane
flattens you? if so, would you be considered irrational?)
Task Environments - PEAS
PEAS: Performance measure, Environment, Actuators, Sensors
PEAS: Vacuum Cleaner
PEAS: Automated taxi driver
Performance measure
• safety, destination, profits, comfort, ...
Environment
• streets/freeways, other traffic, pedestrians, ...
Actuators
• steering, accelerator, brake, horn, speaker/display, ...
Sensors
• video, sonar, speedometer, engine sensors, GPS, ...
Remark
Some goals to be measured may conflict!
e.g. profits vs. safety, profits vs. comfort, ...
⇒ tradeoffs are required
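Purely as an illustration (not from the slides), a PEAS description can be recorded as a small data structure:

# Illustrative sketch: the taxi-driver PEAS description as a dataclass.
from dataclasses import dataclass

@dataclass
class PEAS:
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

taxi_driver = PEAS(
    performance_measure=["safety", "destination", "profits", "comfort"],
    environment=["streets/freeways", "other traffic", "pedestrians"],
    actuators=["steering", "accelerator", "brake", "horn", "speaker/display"],
    sensors=["video", "sonar", "speedometer", "engine sensors", "GPS"],
)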
PEAS: Part-picking robot
Performance measure
• Percentage of parts in correct bins
Environment
• Conveyor belt with parts, bins
Actuators
• Jointed arm and hand
Sensors
• Camera, joint angle sensors
PEAS: Interactive English tutor
Performance measure
• Maximize student's score on test
Environment
• Set of students
Actuators
• Screen display (exercises, suggestions, corrections)
Sensors
• Keyboard, microphone
Properties of Task Environments
Properties of Task Environments [cont.]
Fully observable vs. partially observable
An environment is fully observable if the agent's sensors give it access to the complete state of the environment at each point in time; otherwise it is partially observable.
Properties of Task Environments [cont.]
Discrete vs. continuous
The state of the environment, the way time is handled,
and the agent's percepts & actions can be discrete or
continuous
Ex: Crossword puzzles: discrete state, time, percepts & actions
Ex: Chess: discrete state, time, percepts & actions
Ex: Taxi driving: continuous state, time, percepts & actions
...
Properties of Task Environments [cont.]
Single-agent vs. multi-agent
A task environment is multi-agent iff it contains other agents that are
also maximizing some performance measure that depends on the
current agent's actions
the latter condition is essential
distinction between single- and multi-agent sometimes subtle
Two important cases
competitive multi-agent environment: other agents' goals conflict with
or oppose the agent's goals
Ex: chess, war scenarios, taxi driving (competing for a parking spot), ...
cooperative multi-agent environment: other agents’ goals coincide in
full or in part with the agent’s goals
• Ex: ants’ nest, factory, taxi driving (avoid collisions), ...
Properties of Task Environments [cont.]
Single-agent vs. multi-agent
design problems for multi-agent differ from those for single-agent:
competitive: randomized behaviour is often rational
(it makes the agent unpredictable)
collaborative: communication with other agents is often rational
Note
The simplest environment is fully observable, single-agent,
deterministic, episodic, static and discrete.
Ex: simple vacuum cleaner
Most real-world situations are partially observable, multi-agent,
stochastic, sequential, dynamic, and continuous.
Ex: taxi driving
Properties of Task Environments [cont.]
Example properties of task environments
[Table: example task environments and their properties]
Properties of the Agent’s State of Knowledge
Known vs. unknown
Describes the agent’s (or designer’s) state of knowledge about
the “laws of physics” of the environment
if the environment is known, then the outcomes (or outcome
probabilities if stochastic) for all actions are given.
if the environment is unknown, then the agent will have to learn
how it works in order to make good decisions
Orthogonal wrt. task-environment properties
Agent types
Four basic types in order of increasing generality:
Simple reflex agents
Reflex agents with state/model
Goal-based agents
Utility-based agents
All these can be turned into learning agents
Agent Types: Simple-reflex agent
Select action on the basis of the current percept only
Ex: the simple vacuum-agent
Agent Types: Simple-reflex agent [cont.]
Simple but very limited intelligence.
Action does not depend on percept history, only on current
percept.
Therefore no memory requirements.
may work only if the environment is fully observable
• deadlocks or infinite loops may occur otherwise
• ⇒ limited applicability
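A hedged Python sketch of the general scheme (rule format and names are my own assumptions): condition-action rules applied to the current percept only.

# Sketch of a simple reflex agent: the first matching condition-action rule fires.
RULES = [
    (lambda p: p["status"] == "Dirty", "Suck"),
    (lambda p: p["location"] == "A",   "Right"),
    (lambda p: p["location"] == "B",   "Left"),
]

def simple_reflex_agent(percept):
    for condition, action in RULES:
        if condition(percept):
            return action
    return None  # no rule matched

print(simple_reflex_agent({"location": "A", "status": "Dirty"}))  # Suck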
Agent Types: Model-based agent
[Figure: architecture of a model-based reflex agent, which keeps an internal state updated via a model of how the world evolves and how its actions affect it]
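The figure is not reproduced here; as a rough sketch (state representation and update rule are my own assumptions), a model-based vacuum agent could track which squares it believes are clean:

# Sketch of a model-based reflex agent: internal state updated from the percept
# and from a model of the effect of its own actions.
class ModelBasedVacuumAgent:
    def __init__(self):
        self.state = {"A": "Unknown", "B": "Unknown"}  # believed status of each square

    def __call__(self, percept):
        location, status = percept
        self.state[location] = status                  # update belief from the percept
        if all(s == "Clean" for s in self.state.values()):
            return "NoOp"                              # everything believed clean: do nothing
        if status == "Dirty":
            self.state[location] = "Clean"             # model: Suck cleans the square
            return "Suck"
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
print(agent(("A", "Dirty")), agent(("A", "Clean")), agent(("B", "Dirty")), agent(("B", "Clean")))
# Suck Right Suck NoOp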
Agent Types: Goal-based Agent [cont.]
Goal-based Agents
more flexible:
• the knowledge that supports its decisions is represented explicitly
• such knowledge can be modified
⇒ all of the relevant behaviors can be altered to suit the new conditions
o Ex: If it rains, the agent can update its knowledge of how effectively its
brakes operate
• the goal can be modified/updated ⇒ modify its behaviour
o no need to rewrite all rules from scratch
more complicated to implement
may require expensive computation (search, planning)
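A hedged sketch of the idea (the transition model and goal test below are toy placeholders I made up): the agent searches for a sequence of actions whose predicted outcome satisfies the goal, so changing the goal changes behaviour without rewriting rules.

# Sketch of a goal-based agent: enumerate short action sequences, simulate them
# with an explicit transition model, return one that reaches the goal.
from itertools import product

def goal_based_plan(state, actions, result, goal_test, max_depth=3):
    for depth in range(1, max_depth + 1):
        for plan in product(actions, repeat=depth):
            s = state
            for a in plan:
                s = result(s, a)          # predict the next state with the model
            if goal_test(s):
                return list(plan)
    return None

# Toy usage: reach position 2 on a line, starting at 0.
plan = goal_based_plan(0, ["L", "R"],
                       result=lambda s, a: s + (1 if a == "R" else -1),
                       goal_test=lambda s: s == 2)
print(plan)  # ['R', 'R']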
Agent Types: Utility-based agent
Goals alone often not enough to generate high-quality
behaviors
Certain goals can be reached in different ways, of different
quality
• Ex: some routes are quicker, safer, or cheaper than others
Idea: Add utility function(s) to drive the choice of actions
• maps a (sequence of) state(s) onto a real number
⇒ actions are chosen which maximize the utility function;
under uncertainty, maximize the expected utility
⇒ utility function = internalization of performance measure
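A minimal sketch (function names and the toy numbers are my own) of choosing the action with maximum expected utility:

# Sketch of utility-based action selection: expected utility = sum over outcomes
# of probability * utility; pick the action that maximizes it.
def expected_utility(action, outcomes, utility):
    return sum(p * utility(s) for p, s in outcomes(action))

def choose_action(actions, outcomes, utility):
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Toy usage: two routes with uncertain travel times (minutes); utility = -time.
outcomes = lambda a: {"highway":   [(0.8, 20), (0.2, 60)],
                      "back_road": [(1.0, 35)]}[a]
print(choose_action(["highway", "back_road"], outcomes, lambda t: -t))  # highway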
Agent Types: Utility-based Agent [cont.]
Utility-based Agents
advantages wrt. goal-based:
with conflicting goals, utility specifies an appropriate tradeoff
with several goals, none of which can be achieved with certainty,
utility selects a proper tradeoff between the importance of the goals and
the likelihood of success
still complicated to implement
require sophisticated perception, reasoning, and learning
may require expensive computation
Artificial Intelligence
Ch 02 Intelligent Agents
Part 2 [continued]
Agent Types: Learning
Problem
Previous agent programs describe methods for selecting actions
How are these agent programs programmed?
Programming by hand is inefficient and ineffective!
Solution: build learning machines and then teach them (rather than instruct
them)
Advantage: robustness of the agent program in initially unknown
environments
Agent Types: Learning
Performance element: selects actions based on percepts
Corresponds to the previous agent programs
Agent Types: Learning
Learning element: introduces improvements
uses feedback from the critic on how the agent is doing
determines improvements for the performance element
Agent Types: Learning
Critic tells how the agent is doing wrt. performance standard
Agent Types: Learning
Problem generator: suggests actions that will lead to new and
informative experiences
forces exploration of new stimulating scenarios
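A rough skeleton (component interfaces are my own assumptions) wiring the four components together:

# Skeleton of a learning agent: critic, learning element, performance element
# and problem generator, as described on the previous slides.
class LearningAgent:
    def __init__(self, performance_element, learning_element, critic, problem_generator):
        self.performance_element = performance_element  # selects actions from percepts
        self.learning_element = learning_element        # improves the performance element
        self.critic = critic                            # judges behaviour vs. the performance standard
        self.problem_generator = problem_generator      # suggests exploratory actions

    def step(self, percept):
        feedback = self.critic(percept)                            # how are we doing?
        self.learning_element(self.performance_element, feedback)  # apply improvements
        exploratory = self.problem_generator(percept)              # try something new?
        return exploratory or self.performance_element(percept)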
Learning Agent Types: Example
Taxi Driving
After the taxi makes a quick left turn across three lanes, the critic
observes the shocking language used by other drivers.
From this experience, the learning element formulates a rule
saying this was a bad action.
The performance element is modified by adding the new rule.
The problem generator might identify certain areas of behavior in
need of improvement, and suggest trying out the brakes on
different road surfaces under different conditions.
Artificial Intelligence
Ch 03 Problem Solving by Search
Part 1
Types of Agents
Reflex agent vs. planning agent
[Figure: tic-tac-toe positions contrasting a reflex agent, which picks its move from the current board only, with a planning agent, which considers the consequences of sequences of moves]
Example: Mechanical Fault Diagnosing
[Figure: a diagnostic decision tree. Starting from "What is the problem?" (engine trouble, transmission, brakes, ...), the agent asks a sequence of questions such as "Does the car start?", "Does it turn over?", and "Do the lights come on?", narrowing the fault down, e.g. to a dead battery.]
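Read as code, the tree is just nested questions; an illustrative sketch (simplified, with answers I inferred from the figure):

# Illustrative sketch of the fault-diagnosis decision tree.
def diagnose(car_starts, turns_over=None, lights_come_on=None):
    if car_starts:
        return "starting system ok; check transmission, brakes, ..."
    if not turns_over:
        if not lights_come_on:
            return "battery dead"
        return "battery ok; check starter, ..."
    return "turns over but won't start; check fuel, spark, ..."

print(diagnose(car_starts=False, turns_over=False, lights_come_on=False))  # battery dead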
Search
We will consider the problem of designing goal-based agents in
fully observable, deterministic, discrete, known environments.
[Figure: an example state space with a start state and a goal state]
Search
The agent must find a sequence of actions that reaches the goal.
The performance measure is defined by:
(a) reaching the goal, and
(b) how “expensive” the path to the goal is.
Problem Solving as Search
One of the dominant approaches to AI problem solving: formulate a problem/task
as search in a state space.
Main Paradigm
1 Goal formulation: define the successful world states
Ex: a set of states, a Boolean test function ...
2 Problem formulation:
define a representation for states
define legal actions and transition functions
3 Search: find a solution by means of a search process
solutions are sequences of actions
4 Execution: given the solution, perform the actions
Note
The agent is assumed to have no heuristic knowledge about traveling
in Romania to exploit.
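A sketch (under my own naming, in the spirit of the formulation above) of a generic problem object with initial state, actions, transition model, goal test and step cost:

# Sketch of a generic search-problem formulation.
class Problem:
    def __init__(self, initial_state):
        self.initial_state = initial_state

    def actions(self, state):
        """Legal actions in `state`."""
        raise NotImplementedError

    def result(self, state, action):
        """Transition model: the state reached by doing `action` in `state`."""
        raise NotImplementedError

    def goal_test(self, state):
        """True iff `state` is a goal state."""
        raise NotImplementedError

    def step_cost(self, state, action):
        return 1  # default: every step costs 1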
Problem Solving as Search: Example [cont.]
The choice of a good abstraction involves removing as much detail as possible, while retaining
validity and ensuring that the abstract actions are easy to carry out.
Toy Example: Simple Vacuum Cleaner
States: 2 locations, each {clean, dirty}: 2 · 2² = 8 states
Initial State: any
Actions: { Left, Right, Suck }
Transition Model: Left [Right] when in A [B] has no effect; Suck on a clean square has no effect
Goal Test: check whether all squares are clean
Path Cost: each step costs 1 ⇒ path cost is # of steps in path
Toy Example: The 8-Puzzle
States: integer location of each tile ⇒ 9!/2 reachable states
Initial State: any
Actions: move the empty space { Left, Right, Up, Down }
Transition Model: empty switched with the tile in target location
Goal Test: checks whether the state matches the goal configuration
Path Cost: each step costs 1 ⇒ path cost is # of steps in path
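A hedged sketch of this formulation in code (state = tuple of 9 tiles with 0 for the blank; it follows the Problem interface sketched earlier):

# Sketch: the 8-puzzle as a search problem; actions move the blank (0).
GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)

class EightPuzzle:
    def __init__(self, initial_state):
        self.initial_state = initial_state

    def actions(self, state):
        moves, blank = [], state.index(0)
        row, col = divmod(blank, 3)
        if col > 0: moves.append("Left")
        if col < 2: moves.append("Right")
        if row > 0: moves.append("Up")
        if row < 2: moves.append("Down")
        return moves

    def result(self, state, action):
        blank = state.index(0)
        delta = {"Left": -1, "Right": 1, "Up": -3, "Down": 3}[action]
        s = list(state)
        s[blank], s[blank + delta] = s[blank + delta], s[blank]  # swap blank and tile
        return tuple(s)

    def goal_test(self, state):
        return state == GOAL

    def step_cost(self, state, action):
        return 1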
Toy Example: The 8-Puzzle [cont.]
Toy Example: 8-Queens Problem (incremental)
States: n ≤ 8 queens on board, one per column in the n leftmost
columns, no queen attacking another
⇒ 2057 possible sequences
Actions: Add a queen to any square in the leftmost empty
column such that it is not attacked by any other queen.
...
Real-World Example: Robotic Assembly
States: real-valued coordinates of robot joint angles, and of parts of the
object to be assembled
Initial State: any arm position and object configuration
Actions: continuous motions of robot joints
Transition Model: position resulting from motion
Goal Test: complete assembly (without robot)
Path Cost: time to execute
Artificial Intelligence
Ch 03 Problem Solving by Search
Part 2
Searching for Solutions
Repeated states & Redundant Paths
redundant paths occur when there is more than one way to
get from one state to another
⇒ same state explored more than once
Failure to detect repeated states can:
• cause infinite loops
• turn linear problem into exponential
Moral: Algorithms that forget their history are doomed to repeat it!
Graph Search Algorithms
Graph Search: Basic idea
add a data structure which remembers every expanded node
• a.k.a. explored set or closed list
• typically a hash table (access O(1))
do not expand a node if it already occurs in explored set
Graph Search Algorithms [cont.]
Initialize the frontier using the starting state
While the frontier is not empty
• Choose a frontier node according to search strategy
and take it off the frontier.
• If the node contains the goal state, return solution
• Else expand the node and add its children to the frontier.
Note: the goal test is applied to each node when it is generated, rather than when it is
selected for expansion ⇒ solution detected 1 layer earlier
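A compact sketch of this loop as breadth-first graph search in Python (it assumes the Problem interface sketched in Part 1; the goal test is applied when nodes are generated, as noted above):

# Breadth-first graph search: FIFO frontier plus an explored set.
from collections import deque

def breadth_first_graph_search(problem):
    start = problem.initial_state
    if problem.goal_test(start):
        return []                                  # empty plan: already at the goal
    frontier = deque([(start, [])])                # (state, actions so far)
    explored = {start}
    while frontier:
        state, path = frontier.popleft()           # choose the shallowest frontier node
        for action in problem.actions(state):
            child = problem.result(state, action)
            if child not in explored:              # skip repeated states
                if problem.goal_test(child):       # goal test at generation time
                    return path + [action]
                explored.add(child)
                frontier.append((child, path + [action]))
    return None                                    # no solution

# e.g. with the EightPuzzle sketch above:
# breadth_first_graph_search(EightPuzzle((1, 0, 2, 3, 4, 5, 6, 7, 8)))  -> ['Left']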
Breadth-First Search: Tiers