Intelligent Agents
Module 2
Instructor: Andrew J
Department of CSE, MIT, Manipal
Email: [email protected]
Agent and Environment
[Diagram: the agent receives percepts from the environment through its sensors and acts on the environment through its actuators.]
Agent and Environment
• An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through its effectors/actuators.
• Example:
• Human agent
• Robotic agent
• Software agent
Simple Terms
• Percept
• Agent’s perceptual inputs at any given instant
• Percept sequence
• Complete history of everything that the agent has ever
perceived.
• Action
• An operation involving an actuator
• Actions can be grouped into action sequences
A Windshield Wiper Agent
How do we design an agent that can wipe the windshields when needed?
• Goals?
• Percepts?
• Sensors?
• Effectors?
• Actions?
• Environment?
A Windshield Wiper Agent (Cont’d)
• Goals: Keep windshields clean & maintain visibility
• Percepts: Raining, Dirty
• Sensors: Camera, moisture (rain) sensor
• Effectors: Wipers (left, right, back)
• Actions: Off, Slow, Medium, Fast
• Environment: Inner city, highways, weather
Interacting Agents
Collision Avoidance Agent (CAA)
• Goals: Avoid running into obstacles
• Percepts ?
• Sensors?
• Effectors ?
• Actions ?
• Environment: Freeway
Lane Keeping Agent (LKA)
• Goals: Stay in current lane
• Percepts ?
• Sensors?
• Effectors ?
• Actions ?
• Environment: Freeway
Interacting Agents
Collision Avoidance Agent (CAA)
• Goals: Avoid running into obstacles
• Percepts: Obstacle distance, velocity, trajectory
• Sensors: Vision, proximity sensing
• Effectors: Steering Wheel, Accelerator, Brakes, Horn, Headlights
• Actions: Steer, speed up, brake, blow horn, signal (headlights)
• Environment: Highway
Lane Keeping Agent (LKA)
• Goals: Stay in current lane
• Percepts: Lane center, lane boundaries
• Sensors: Vision
• Effectors: Steering Wheel, Accelerator, Brakes
• Actions: Steer, speed up, brake
• Environment: Highway
Agent function & program
• Agent’s behavior is described by
• Agent function
• A function mapping any given percept sequence to an action
• Abstract mathematical description
• Practically it is described by
• An agent program
• Concrete implementation, running within some physical system
Vacuum-cleaner world
• Percepts: which square the agent is in, and whether that square is Clean or Dirty
• Actions: Move left, Move right, Suck, Do nothing
Vacuum-cleaner world
• The agent program implements the agent function:

function Reflex-Vacuum-Agent([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
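A minimal runnable sketch of this agent program in Python (the two-square world with locations A and B follows the slide; the test loop at the end is just an illustrative assumption):

# Reflex vacuum agent: maps the current percept (location, status) to an action,
# implementing the agent function for the two-square vacuum world.
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"

# Actions the agent function prescribes for each possible percept.
for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Dirty"), ("B", "Clean")]:
    print(percept, "->", reflex_vacuum_agent(percept))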
Behavior and performance of Agents in
terms of agent function
• Perception (sequence) to Action Mapping:
• Ideal mapping: specifies which actions an agent ought to take
at any point in time
• Description: Look-Up-Table
• Rational Agent: One that does the right thing.
• Performance measure: a subjective measure to characterize how successful an agent is (e.g., speed, power usage, accuracy, money, etc.)
Performance measure
• A general rule: design performance measures according to
• What one actually wants in the environment
• Rather than how one thinks the agent should behave
• E.g., in the vacuum-cleaner world
• We want the floor clean, no matter how the agent behaves
• We don't restrict how the agent behaves
How is an Agent different from other software?
• Agents are autonomous, that is, they act on behalf of the user
• Agents contain some level of intelligence, from fixed rules to learning engines that allow them to adapt to changes in the environment
• Agents not only act reactively, but sometimes also proactively
• Agents have social ability, that is, they communicate with the user, the system, and other agents as required
• Agents may also cooperate with other agents to carry out more complex tasks than they themselves can handle
• Agents may migrate from one system to another to access remote resources or even to meet other agents
Rationality
• What is rational at any given time depends on four things:
• The performance measure defining the criterion of success
• The agent’s prior knowledge of the environment
• The actions that the agent can perform
• The agent's percept sequence up to now
Rational agent
• For each possible percept sequence,
• Rational agent should select
• an action expected to maximize its performance measure, given the
evidence provided by the percept sequence and whatever built-in
knowledge the agent has
• E.g., an exam
• Maximize marks, based on
the questions on the paper & your knowledge
Rational agent
• E.g., Vacuum Cleaner
• Performance Measure: one point for each clean square at each time step
• Geography of the environment is known a priori
• Actions: Left, Right and Suck
• The agent correctly perceives its location and whether that location contains dirt
Omniscience
• An omniscient agent
• Knows the actual outcome of its actions in advance
• No other possible outcomes
• However, impossible in real world
Omniscience
• Rationality maximizes
• Expected performance
• Perfection maximizes
• Actual performance
• Hence rational agents are not omniscient.
Learning
• Does a rational agent depend only on the current percept?
• No, the past percept sequence should also be used
• This is called learning
• After experiencing an episode, the agent
• should adjust its behaviors to perform better for the same job next
time.
Autonomy
• If an agent just relies on the prior knowledge of its designer rather
than its own percepts then the agent lacks autonomy
• A rational agent should be autonomous: it should learn what it can to compensate for partial or incorrect prior knowledge.
Nature of Environments
• Task environments are the problems
• While the rational agents are the solutions
• Specifying the task environment through PEAS
(Performance, Environment, Actuators, Sensors)
• In designing an agent, the first step must always be to specify the task
environment as fully as possible.
• Eg: Automated taxi driver
Task environments
• Performance measure
• How can we judge the automated driver?
• Which factors are considered?
• getting to the correct destination
• minimizing fuel consumption
• minimizing the trip time and/or cost
• minimizing the violations of traffic laws
• maximizing the safety and comfort, etc.
Task environments
• Environment
• A taxi must deal with a variety of roads
• Traffic lights, other vehicles, pedestrians, stray animals, road
works, police cars, etc.
• Interact with the customer
Task environments
• Actuators (for outputs)
• Control over the accelerator, steering, gear shifting and braking
• A display to communicate with the customers
• Sensors (for inputs)
• Detect other vehicles, road situations
• GPS (Global Positioning System)
• Odometer, engine sensors, etc.
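As a concrete illustration, the PEAS description above can be written down as a simple structured specification; the field values follow the slides, while the Python dictionary form itself is only an assumption for illustration:

# PEAS specification of the automated taxi driver task environment (illustrative).
taxi_peas = {
    "Performance": ["correct destination", "fuel consumption", "trip time/cost",
                    "traffic-law violations", "safety and comfort"],
    "Environment": ["roads", "traffic lights", "other vehicles", "pedestrians",
                    "stray animals", "road works", "police cars", "customers"],
    "Actuators":   ["accelerator", "steering", "gear", "brakes", "display"],
    "Sensors":     ["cameras", "proximity sensors", "GPS", "odometer", "engine sensors"],
}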
Properties of task
environments
• Fully observable vs. Partially observable
• If an agent’s sensors give it access to the complete state of the
environment at each point in time then the environment is fully
observable
• An environment might be Partially observable because of noisy
and inaccurate sensors or because parts of the state are simply
missing from the sensor data.
• Fully observable environments are convenient because the agent need not maintain any internal state to keep track of the world.
Properties of task
environments
• Single agent vs. Multi agent
• Playing a crossword puzzle – single agent
• Chess playing – two agents
• Competitive multiagent environment
• Chess playing
• Cooperative multiagent environment
• Automated taxi driver
• Avoiding collision
Properties of task
environments
• Deterministic vs. Stochastic
• If the next state of the environment is completely determined by the current state and the actions executed by the agent, then the environment is deterministic; otherwise, it is stochastic.
• An environment is uncertain if it is not fully observable or not deterministic
• In a stochastic environment, outcomes are quantified in terms of probabilities
• Taxi driving is stochastic
• The vacuum-cleaner world may be deterministic or stochastic
Properties of task environments
• Episodic vs. Sequential
• An episode = agent’s single pair of perception & action
• The quality of the agent’s action does not depend on other episodes
• Every episode is independent of each other
• Episodic environment is simpler
• The agent does not need to think ahead
• Sequential
• Current action may affect all future decisions
• E.g., taxi driving and chess
Properties of task environments
• Static vs. dynamic
• A dynamic environment is always changing over time
• E.g., the number of people in the street
• A static environment, in contrast, does not change over time
• E.g., the destination
• Semidynamic
• The environment itself does not change with the passage of time
• But the agent's performance score does
• E.g., chess when played with a clock
Properties of task environments
• Discrete vs. Continuous
• If there are a limited number of distinct, clearly defined states, percepts and actions, the environment is discrete
• E.g., a chess game is discrete; taxi driving is continuous
Properties of task environments
• Known vs. Unknown
• This distinction refers to the agent's (or designer's) state of knowledge about the environment.
• In a known environment, the outcomes for all actions are given (example: solitaire card games).
• If the environment is unknown, the agent will have to learn how it works in order to make good decisions (example: a new video game).
Properties of task environments
• Fully observable vs. Partially observable
• Single agent vs. multiagent
• Deterministic vs. stochastic
• Episodic vs. sequential
• Static vs. dynamic
• Discrete vs. continuous
• Known vs. unknown
Environment example
Crossword puzzle:
Observable:
Agents:
Deterministic:
Episodic:
Static:
Discrete:
Environment example
Crossword puzzle:
Observable: Fully
Agents: Single
Deterministic: Deterministic
Episodic: Sequential
Static: Static
Discrete: Discrete
Environment example
Taxi driving:
Observable:
Agents:
Deterministic:
Episodic:
Static:
Discrete:
Environment example
Taxi driving:
Observable: Partially
Agents: Multi
Deterministic: Stochastic
Episodic: Sequential
Static: Dynamic
Discrete: Continuous
Structure of agents
• Agent = architecture + program
• Architecture = some sort of computing device (sensors + actuators)
• (Agent) Program = some function that implements the agent mapping from percepts to actions
• Agent Program = Job of AI
Agent programs
• Skeleton design of an agent program
Table-driven agents
• Table lookup of percept-action pairs mapping from
every possible perceived state to the optimal action for
that state
• Problems
• Too big to generate and to store (chess has about 10^120 states, for example)
• Not adaptive to changes in the environment;
requires entire table to be updated if changes occur
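A minimal Python sketch of a table-driven skeleton (the tiny lookup table for the two-square vacuum world is an illustrative assumption; a real table would be indexed by entire percept sequences and would be astronomically large):

# Table-driven agent: appends each percept to the percept sequence and looks the
# action up in a table keyed by the whole sequence seen so far.
percepts = []

# Illustrative table: only length-1 percept sequences for the two-square vacuum world.
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("B", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
}

def table_driven_agent(percept):
    percepts.append(percept)
    # Unknown (longer) sequences are not in this toy table, so default to NoOp.
    return table.get(tuple(percepts), "NoOp")

print(table_driven_agent(("A", "Dirty")))   # -> Suck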
Types of agent programs
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
• Learning agents
Simple reflex agent architecture
A Simple Reflex Agent in Nature
• Percepts: size, motion
RULES:
(1) If small moving object,
then activate SNAP
(2) If large moving object,
then activate AVOID and inhibit SNAP
ELSE (not moving) then NOOP
Action: SNAP or AVOID or NOOP
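These condition-action rules could be sketched in Python as follows (encoding the percept as a (size, moving) pair is an assumption made for illustration):

# Simple reflex agent for the frog: condition-action rules over the current percept only.
def frog_agent(percept):
    size, moving = percept              # e.g., ("small", True)
    if moving and size == "small":
        return "SNAP"
    elif moving and size == "large":
        return "AVOID"                  # AVOID also inhibits SNAP
    else:
        return "NOOP"                   # nothing is moving

print(frog_agent(("small", True)))      # -> SNAP
print(frog_agent(("large", True)))      # -> AVOID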
Simple Vacuum Reflex Agent
function Vacuum-Agent([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
Simple reflex agents
• Rule-based reasoning to map from percepts to optimal action; each
rule handles a collection of perceived states
• Problems
• Still usually too big to generate and to store
• Still no knowledge of non-perceptual parts of state
• Still not adaptive to changes in the environment; requires
collection of rules to be updated if changes occur
(2) Model-based reflex agents
• Encode “internal state” of the world to remember the past as
contained in earlier percepts.
• Requires two types of knowledge
• How the world evolves independently of the agent
• How the agent's actions affect the world
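A rough structural sketch of such an agent in Python (the trivial "remember the latest percept values" model and the rule format are assumptions; a real model would encode how the world evolves and how actions change it):

# Model-based reflex agent skeleton: keeps an internal state, updated from each
# percept, and selects an action with condition-action rules over that state.
class ModelBasedReflexAgent:
    def __init__(self, rules):
        self.state = {}            # internal memory of the world
        self.last_action = None
        self.rules = rules         # list of (condition, action) pairs over the state

    def update_state(self, percept):
        # Placeholder model: just remember the latest value of each percept field.
        self.state.update(percept)

    def __call__(self, percept):
        self.update_state(percept)
        for condition, action in self.rules:
            if condition(self.state):     # first matching rule wins
                self.last_action = action
                return action
        self.last_action = "NoOp"
        return "NoOp"

# Illustrative use with the vacuum-world rules from the earlier slides.
rules = [(lambda s: s.get("status") == "Dirty", "Suck"),
         (lambda s: s.get("location") == "A", "Right"),
         (lambda s: True, "Left")]
agent = ModelBasedReflexAgent(rules)
print(agent({"location": "A", "status": "Dirty"}))   # -> Suck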
Model-based Reflex Agents
• The agent has a memory (internal state)
(2)Model-based agent architecture
(3) Goal-based agents
• Choose actions so as to achieve a (given or computed) goal.
• A goal is a description of a desirable situation.
• Keeping track of the current state is often not enough -
need to add goals to decide which situations are good
• Deliberative instead of reactive.
• May have to consider long sequences of possible actions
before deciding if goal is achieved – involves consideration
of the future, “what will happen if I do...?”
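One way to make "what will happen if I do...?" concrete is a one-step lookahead against a goal test; the tiny one-dimensional world and the helper names below are illustrative assumptions, not a general planner:

# Goal-based agent sketch: simulate each action with a model of its result and
# pick one whose predicted outcome satisfies the goal.
def goal_based_agent(state, goal, actions, result):
    for action in actions:
        if goal(result(state, action)):   # "what will happen if I do this?"
            return action
    return "NoOp"                         # no single action reaches the goal

# Illustrative 1-D world: the agent at position 2 wants to reach position 3.
result = lambda s, a: s + (1 if a == "Right" else -1)
goal = lambda s: s == 3
print(goal_based_agent(2, goal, ["Left", "Right"], result))   # -> Right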
Example: Tracking a Target
• The robot must keep
the target in view
• The target’s trajectory
is not known in advance
• The robot may not know all the obstacles in advance
• A fast decision is required
(3) Architecture for goal-based agent
(4) Utility-based agents
• When there are multiple possible alternatives, how to
decide which one is best?
• A goal specifies a crude distinction between a happy and
unhappy state, but often need a more general performance
measure that describes “degree of happiness.”
• Utility function U: State → Reals indicating a measure of success or happiness when at a given state.
• Allows decisions comparing choice between conflicting
goals, and choice between likelihood of success and
importance of goal (if achievement is uncertain).
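A minimal sketch of choosing among conflicting alternatives with a utility function and outcome probabilities (the actions, probabilities and utility values are made up purely for illustration):

# Utility-based choice: pick the action with the highest expected utility.
def expected_utility(outcomes, utility):
    # outcomes: list of (probability, resulting_state) pairs for one action
    return sum(p * utility(s) for p, s in outcomes)

def best_action(action_outcomes, utility):
    return max(action_outcomes, key=lambda a: expected_utility(action_outcomes[a], utility))

# Illustrative example: driving fast is better if it succeeds, but risks a very bad state.
utility = {"on_time": 10, "late": 2, "crash": -100}.get
action_outcomes = {
    "drive_fast": [(0.9, "on_time"), (0.1, "crash")],   # expected utility = -1
    "drive_slow": [(1.0, "late")],                       # expected utility = 2
}
print(best_action(action_outcomes, utility))             # -> drive_slow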
(4) Architecture for a complete
utility-based agent
Learning Agents
• After an agent is programmed, can it work immediately?
• No, it still needs teaching
• In AI,
• Once an agent is built
• We teach it by giving it a set of examples
• Test it by using another set of examples
• We then say the agent learns
• A learning agent
Learning Agents
• Four conceptual components
• Learning element
• Making improvement
• Performance element
• Selecting external actions
• Critic
• Tells the Learning element how well the agent is doing with respect to fixed
performance standard.
(Feedback from user or examples, good or not?)
• Problem generator
• Suggest actions that will lead to new and informative experiences.
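The four components could be wired together roughly as shown below; this is only a structural sketch, and the simple averaging update, the exploration rate and the toy critic are all assumptions made for illustration:

import random

# Learning agent skeleton: the performance element picks actions, the critic scores
# them against a fixed standard, the learning element improves the estimates, and
# the problem generator occasionally suggests new, informative actions to try.
class LearningAgent:
    def __init__(self, actions):
        self.values = {a: 0.0 for a in actions}       # performance element's knowledge

    def performance_element(self):
        return max(self.values, key=self.values.get)  # exploit the best-known action

    def problem_generator(self):
        return random.choice(list(self.values))       # explore something new

    def learning_element(self, action, feedback):
        self.values[action] += 0.1 * (feedback - self.values[action])  # improve estimate

    def step(self, critic, explore=0.1):
        action = self.problem_generator() if random.random() < explore \
                 else self.performance_element()
        self.learning_element(action, critic(action))  # critic supplies the feedback
        return action

agent = LearningAgent(["Left", "Right", "Suck"])
for _ in range(100):
    agent.step(critic=lambda a: 1.0 if a == "Suck" else 0.0)
print(agent.performance_element())   # usually converges toward "Suck"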
Learning Agents
How the components of agent programs work
• atomic representation
• each state of the world is indivisible
• Has no internal structure
• Algorithms: search, game playing, Hidden Markov models, Markov decision
process
• factored representation
• splits up each state into a fixed set of variables or attributes
• each of which can have a value
• Algorithms: constraint satisfaction, propositional logic, planning, Bayesian
networks, machine learning
• structured representation
• a state includes objects, which may have attributes of their own as well as relationships to other objects
• Algorithms: relational databases, first-order logic, first-order probability models, knowledge-based learning, natural language understanding
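A small illustration of how the same vacuum/taxi-style world state might look in the three representations (the particular encodings are assumptions for illustration):

# The same world state, represented three ways (illustrative).
atomic_state = "state_42"        # atomic: an indivisible label with no internal structure

factored_state = {               # factored: a fixed set of variables, each with a value
    "location": "A",
    "fuel": 0.7,
    "dirt_at_A": True,
}

structured_state = [             # structured: objects and relations between them
    ("Dirty", "A"),
    ("At", "taxi", "A"),
    ("Adjacent", "A", "B"),
]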
Summary: Agents
• An agent perceives and acts in an environment, has an architecture, and is implemented by
an agent program.
• Task environment – PEAS (Performance, Environment, Actuators, Sensors)
• An ideal agent always chooses the action which maximizes its expected performance, given
its percept sequence so far.
• An autonomous learning agent uses its own experience rather than relying only on built-in knowledge of the environment provided by the designer.
• An agent program maps from percept to action and updates internal state.
• Reflex agents respond immediately to percepts.
• Goal-based agents act in order to achieve their goal(s).
• Utility-based agents maximize their own utility function.
• Representing knowledge is important for successful agent design.
• The most challenging environments are not fully observable, nondeterministic, dynamic, and
continuous