10. Q-Learning Algorithm

The document outlines the implementation of a Q-learning algorithm to navigate a grid environment with 16 states and 4 possible actions. It describes the initialization of a Q-table, the learning parameters, and the training process over 1000 epochs using an epsilon-greedy strategy for action selection. The final output is the learned Q-table, which reflects the agent's performance in reaching the goal state.

PROGRAM-10

Implement a Q-learning algorithm to navigate a simple grid environment, defining the reward
structure and analyzing agent performance.
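For reference (this note is an addition for clarity, using the program's own parameter names), the update applied inside the training loop below is the standard Q-learning rule, with learning rate $\alpha$ (learning_rate), discount factor $\gamma$ (discount_factor), reward $r$, and next state $s'$:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$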

import numpy as np

# Define the environment
n_states = 16    # Number of states in the grid world
n_actions = 4    # Number of possible actions (up, down, left, right)
goal_state = 15  # Goal state

# Initialize Q-table with zeros
Q_table = np.zeros((n_states, n_actions))

# Define parameters
learning_rate = 0.8
discount_factor = 0.95
exploration_prob = 0.2
epochs = 1000

# Q-learning algorithm
for epoch in range(epochs):
    current_state = np.random.randint(0, n_states)  # Start from a random state

    while current_state != goal_state:
        # Choose action with epsilon-greedy strategy
        if np.random.rand() < exploration_prob:
            action = np.random.randint(0, n_actions)   # Explore
        else:
            action = np.argmax(Q_table[current_state])  # Exploit

        # Simulate the environment (for simplicity, always move to the next state)
        next_state = (current_state + 1) % n_states

        # Simple reward function: 1 if the goal state is reached, 0 otherwise
        reward = 1 if next_state == goal_state else 0

        # Update Q-value using the Q-learning update rule
        Q_table[current_state, action] += learning_rate * \
            (reward + discount_factor * np.max(Q_table[next_state])
             - Q_table[current_state, action])

        current_state = next_state  # Move to the next state

# After training, the Q-table represents the learned Q-values
print("Learned Q-table:")
print(Q_table)

Learned Q-table:
[[0.48767498 0.48751892 0.48751892 0.46816798]
[0.51334208 0.51330923 0.51334207 0.50923535]
[0.54036009 0.5403255 0.54036003 0.5403587 ]
[0.56880009 0.56880009 0.56880008 0.56880009]
[0.59873694 0.59873694 0.59873694 0.59873694]
[0.63024941 0.63024941 0.63024941 0.63024941]
[0.66342043 0.66342043 0.66342043 0.66342043]
[0.6983373 0.6983373 0.6983373 0.6983373 ]
[0.73509189 0.73509189 0.73509189 0.73509189]
[0.77378094 0.77378094 0.77378094 0.77378094]
[0.81450625 0.81450625 0.81450625 0.81450625]
[0.857375 0.857375 0.857375 0.857375 ]
[0.9025 0.9025 0.9025 0.9025 ]
[0.95 0.95 0.95 0.95 ]
[1. 1. 1. 1. ]
[0. 0. 0. 0. ]]
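
As a small follow-up for the "analyzing agent performance" part of the task, the snippet below is a minimal sketch (not part of the original listing) that reads the greedy policy and state values out of the learned table. It assumes the program above has already been run in the same session, so that np, Q_table and n_states are defined.

# Minimal sketch: derive the greedy policy and state values from Q_table.
# Assumes the training program above has been executed in the same session.
greedy_actions = np.argmax(Q_table, axis=1)   # best action index per state
state_values = np.max(Q_table, axis=1)        # value of acting greedily in each state

for state in range(n_states):
    print(f"state {state:2d}: best action = {greedy_actions[state]}, "
          f"value = {state_values[state]:.4f}")

Because the simplified transition always moves to (current_state + 1) % n_states regardless of the chosen action, the learned values decay by roughly the discount factor 0.95 per step away from the goal, which matches the printed table; the goal state's row stays at zero because the loop exits before that state is ever updated.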
