Q Learning

The document discusses Q learning, a type of reinforcement learning that enables agents to learn optimal actions through interaction with their environment to maximize long-term rewards. It outlines the components of Q learning, including states, actions, rewards, and the Q matrix, and provides examples of its application, such as navigating rooms. The Q learning algorithm is explained step-by-step, illustrating how agents update their knowledge to achieve specific goals.

Artificial Intelligence

Q learning

Pham Viet Cuong


Dept. Control Engineering & Automation, FEEE
Ho Chi Minh City University of Technology
Q learning
✓ Supervised learning: classification, regression
✓ Unsupervised learning: clustering
✓ Reinforcement learning:
   ❖ More general than supervised/unsupervised learning
   ❖ Learn from interaction with the environment (perform actions and observe rewards) to achieve a goal
   ❖ Goal: learn a policy that maximizes some measure of long-term reward

Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 2
✓ Examples: video games (example figures not recovered)

✓ Example:
   ❖ Put an agent in any room
   ❖ Goal: go to Room 5 by the fastest route
   [Figure: room layout showing Rooms 0, 1, 2, 3 and 4; diagram not recovered]

✓ State: Room 0, Room 1, ..., Room 5
✓ Action: go to Room 0, go to Room 1, ..., go to Room 5
✓ Reward: matrix R
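
The reward matrix itself did not survive extraction. As a sketch, the block below encodes the six-room layout from the path-finding tutorial listed in the references, which is an assumption about what this slide showed: -1 marks a move with no door between the rooms, 0 a valid move, and 100 a move that reaches the goal, Room 5. The names `R`, `GOAL` and `valid_actions` are illustrative only.

```python
# Reward matrix R: rows are the current room (state), columns are the room
# the agent tries to move to (action).
# ASSUMPTION: layout taken from the path-finding tutorial in the references.
#   -1 = no door between the rooms, 0 = door, 100 = door into the goal room.
GOAL = 5
R = [
    [-1, -1, -1, -1,  0,  -1],  # from Room 0
    [-1, -1, -1,  0, -1, 100],  # from Room 1
    [-1, -1, -1,  0, -1,  -1],  # from Room 2
    [-1,  0,  0, -1,  0,  -1],  # from Room 3
    [ 0, -1, -1,  0, -1, 100],  # from Room 4
    [-1,  0, -1, -1,  0, 100],  # from Room 5 (staying at the goal)
]


def valid_actions(state):
    """Return the rooms reachable from `state` (entries of R that are not -1)."""
    return [a for a, r in enumerate(R[state]) if r != -1]


print(valid_actions(1))  # [3, 5]
```

Under this assumed layout, only Rooms 1 and 4 (and Room 5 itself) have a direct move into the goal.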

✓ Matrix Q: memory of what the agent has learned through experience
   ❖ The agent starts out knowing nothing
   ❖ Q is initialized to zero
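
A minimal sketch of the zero initialization, assuming Q has the same shape as R (one row per state, one column per action). `N_ROOMS` is a hypothetical name for the six rooms listed earlier.

```python
N_ROOMS = 6  # Rooms 0..5

# One row per state, one column per action; all zeros means "knows nothing".
# A list comprehension is used so that every row is an independent list.
Q = [[0.0] * N_ROOMS for _ in range(N_ROOMS)]

print(Q[1][5])  # 0.0
```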

✓ Defined:
   ❖ States
   ❖ Actions
   ❖ Reward matrix R
   ❖ Matrix Q
✓ Training in progress:
   ❖ Updating matrix Q

✓ Using the trained Q matrix:
   ❖ Step 1: Set current state = initial state.
   ❖ Step 2: From the current state, find the action with the highest Q value.
   ❖ Step 3: Perform the action chosen in Step 2.
   ❖ Step 4: Set current state = next state.
   ❖ Step 5: Repeat Steps 2, 3 and 4 until current state = goal state.
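
The five steps above can be sketched as a greedy walk over Q. The Q values below are hypothetical: they are the fixed point of Q(s, a) = R(s, a) + 0.8 × max[Q(next state, all actions)] for the room layout in the path-finding tutorial listed in the references, not numbers recovered from the slides. `greedy_path` and `max_steps` are illustrative names; the step cap is an added safety guard, not part of the procedure.

```python
# ASSUMPTION: Q values for the tutorial's room layout with gamma = 0.8.
Q = [
    [  0,   0,   0,   0, 400,   0],  # Room 0
    [  0,   0,   0, 320,   0, 500],  # Room 1
    [  0,   0,   0, 320,   0,   0],  # Room 2
    [  0, 400, 256,   0, 400,   0],  # Room 3
    [320,   0,   0, 320,   0, 500],  # Room 4
    [  0, 400,   0,   0, 400, 500],  # Room 5 (goal)
]
GOAL = 5


def greedy_path(Q, start, goal, max_steps=20):
    """Follow the highest-Q action from `start` until `goal` is reached."""
    state = start                                   # Step 1
    path = [state]
    while state != goal and len(path) <= max_steps:  # Step 5
        # Step 2: action with the highest Q value (first one wins on ties).
        action = max(range(len(Q[state])), key=lambda a: Q[state][a])
        # Steps 3 and 4: the action "go to Room k" puts the agent in Room k.
        state = action
        path.append(state)
    return path


print(greedy_path(Q, 2, GOAL))  # [2, 3, 1, 5]
```

From Room 3 the moves to Room 1 and Room 4 tie at 400, so either route is equally short; this sketch takes the lower-numbered room.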

✓ Q learning algorithm: (figure not recovered)
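
Since the algorithm figure is missing, here is a minimal training sketch under stated assumptions: the update Q(state, action) = R(state, action) + gamma × max[Q(next_state, all actions)] and the reward matrix both follow the path-finding tutorial listed in the references, and actions are chosen uniformly at random among valid moves. The function name `train`, the episode count and the seed are hypothetical choices.

```python
import random

GAMMA = 0.8
GOAL = 5
# ASSUMPTION: reward matrix from the referenced tutorial (-1 = no door).
R = [
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
]


def train(episodes=2000, seed=0):
    """Run random-exploration episodes and return the learned Q matrix."""
    rng = random.Random(seed)
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        state = rng.randrange(n)  # random initial room
        while True:
            # Explore: pick any move the layout allows from this room.
            action = rng.choice([a for a in range(n) if R[state][a] != -1])
            next_state = action   # "go to Room k" lands in Room k
            # Invalid actions keep Q = 0 and valid rewards are >= 0, so the
            # max over the whole row equals the max over valid actions.
            Q[state][action] = R[state][action] + GAMMA * max(Q[next_state])
            state = next_state
            if state == GOAL:     # episode ends once the goal is reached
                break
    return Q


Q = train()
for row in Q:
    print([round(q) for q in row])
```

Each episode takes at least one action before checking for the goal, so moves out of Room 5 are also updated; without that, the Q values for Room 5 would stay at zero.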

✓ Q learning algorithm: gamma = 0.8, episode 1, initial state = 1
   ❖ state = 1, action: go to 5, next_state = 5
   ❖ Q(1, 5) = R(1, 5) + 0.8 × max[Q(5, all actions)] = 100 + 0.8 × 0 = 100

✓ Q learning algorithm: episode 2, initial state = 3
   ❖ state = 3, action: go to 1, next_state = 1
   ❖ Q(3, 1) = R(3, 1) + 0.8 × max[Q(1, all actions)] = 0 + 0.8 × 100 = 80

✓ Q learning algorithm: episode 2 (continued), initial state = 3
   ❖ state = 1, action: go to 5, next_state = 5
   ❖ Q(1, 5) = R(1, 5) + 0.8 × max[Q(5, all actions)] = 100 + 0.8 × 0 = 100
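
The three updates in episodes 1 and 2 can be replayed in a few lines. This is a sketch that assumes the reward matrix from the path-finding tutorial listed in the references and the update Q(state, action) = R(state, action) + gamma × max[Q(next_state, all actions)]; `update` is a hypothetical helper name.

```python
GAMMA = 0.8
# ASSUMPTION: reward matrix from the referenced tutorial (-1 = no door).
R = [
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
]
Q = [[0.0] * 6 for _ in range(6)]  # the agent starts out knowing nothing


def update(state, action):
    """Apply one Q update for moving from `state` to room `action`."""
    next_state = action
    Q[state][action] = R[state][action] + GAMMA * max(Q[next_state])
    return Q[state][action]


q_ep1 = update(1, 5)   # episode 1:          100 + 0.8 * 0
q_ep2a = update(3, 1)  # episode 2, step 1:  0   + 0.8 * 100
q_ep2b = update(1, 5)  # episode 2, step 2:  100 + 0.8 * 0
print(q_ep1, q_ep2a, q_ep2b)
```

After these steps only two entries of Q are non-zero: Q(1, 5) and Q(3, 1). The second visit to (1, 5) leaves its value unchanged because every Q value for Room 5 is still zero.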

Artificial Neural Networks
✓ References
   ❖ http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture14.pdf
   ❖ http://mnemstudio.org/path-finding-q-learning-tutorial.htm
