Virtual Lab Assistant
Certificate
This is to certify that the Dissertation report entitled Securing Digital Assets: Intelligent, submitted by Nallamanti Jeevana Priya bearing registration number (23VV1F0022), is in partial fulfillment for the degree of
DECLARATION
I, Nallamanti Jeevana Priya (Reg. No: 23VV1F0022), declare that this written submission
represents my ideas in my own words, and where others' ideas or words have been included, I
have adequately cited and referenced the original sources. I also declare that I have adhered
to all principles of academic honesty and integrity and have not misrepresented, fabricated,
or falsified any idea, data, fact, or source in my submission. I understand that any violation
of the above will be cause for disciplinary action by the institute and can also evoke penal
action from the sources which have thus not been properly cited or from whom proper
permission has not been taken when needed.
(Signature)
Nallamanti Jeevana Priya
(23VV1F0022)
Date :
Place :
ACKNOWLEDGEMENT
This acknowledgement is more than a formality: I express my deep gratitude and respect to
all those people behind the scenes who inspired and helped me in the completion of this
project work.
I take the privilege to express my heartfelt gratitude to my guide, Dr. G. JAYA SUMA, Registrar
and Professor of Information Technology, JNTUGV-CEV, for her valuable suggestions and constant
motivation, which greatly helped me in the successful completion of the project. The
wholehearted cooperation and keen interest she showed at all stages are beyond words of gratitude.
With great pleasure and privilege, I wish to express my heartfelt sense of gratitude and
indebtedness to Dr. Ch. BINDU MADHURI, Assistant Professor, Head of the Department of
Information Technology, JNTUGV-CEV, for her supervision.
I express my sincere thanks to my Project Coordinator, Dr. Ch. BINDU MADHURI, Assistant
Professor and Head of the Department of Information Technology, JNTU-GURAJADA VIZIANAGARAM,
for her continuous support.
I extend heartfelt thanks to our Principal, Prof. R. RAJESWARA RAO, for providing support
throughout my project.
I am also thankful to all the Teaching and Non-Teaching staff of the Information Technology
Department, JNTUGV-CEV, for their direct and indirect help provided to me in completing the
project.
I extend my thanks to my parents and friends for their help and encouragement in the success of
my project.
Nallamanti Jeevana Priya (23VV1F0022)
Abstract
Learning core programming concepts such as Data Structures and Algorithms (DSA)
remains a challenge for many students, especially when taught through conventional
lectures and text-heavy resources. Visual learning and interactivity have proven to
significantly enhance comprehension and engagement. This project presents the design
and development of an intelligent, interactive web-based platform titled Virtual Lab
Assistant, aimed at simplifying the learning of programming fundamentals.
The system is built using the MERN stack (MongoDB, Express.js, React.js, Node.js) to
deliver a scalable and responsive user experience. It provides animated visualizations of
core DSA operations—such as stack push/pop, queue enqueue/dequeue, and linked list
traversal—allowing users to input values and observe algorithm behavior step by step.
The platform also features a built-in code editor supporting C, Java, and Python with
preloaded sample codes to facilitate practice.
Additionally, a lightweight machine learning layer is integrated to enhance topic
recommendations, provide voice-based interaction, and adapt content based on user
behavior. This fusion of full-stack web development with ML intelligence makes the
Virtual Lab Assistant a robust learning companion for students and educators alike.
The system aims to promote self-paced, visual, and practice-driven learning, ultimately
transforming theoretical knowledge into practical mastery.
Keywords:
• Data Structure Visualization
• Virtual Lab Assistant
• MERN Stack
• Machine Learning in Education
• Interactive Programming Learning
• IDE Integration
• Voice-Based Interface
• Student-Centric Learning Platforms
Contents
Acknowledgements vi
Abstract vii
List of Figures 10
1 Introduction 2
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Scope .......................................... 4
1.5 Introduction to Machine Learning .......................... 4
1.5.1 Working of Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5.2 Pros and Cons of Machine Learning ..................... 5
1.6 Introduction to Django . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6.1 Working of Django . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6.2 Count Vectorizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.6.3 TF-IDF Vectorizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7 Introduction to Similarity Measures ......................... 7
1.7.1 Cosine Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.7.2 Jaccard Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7
1.8 Aim of the Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Literature Review 10
2.1 Existing Systems for Phishing Detection . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Existing Systems for PDF Malware Detection . . . . . . . . . . . . . . . . . . . . 11
2.3 Limitations in Current Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Research Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3 Software and Hardware Requirements 14
3.1 Software Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Library Versions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Hardware Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4 Methodology 18
4.1 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Dataset Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5 Implementation 27
5.1 Data Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.1.1 URL Dataset Sources ............................. 27
5.1.2 PDF Dataset Sources ............................. 27
5.2 Feature Engineering .................................. 28
5.2.1 Phishing URL Feature Extraction ...................... 28
5.2.2 PDF Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.3 Model Building and Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.3.1 Model Building . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.3.2 Model Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.4 Django Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.4.1 Views and Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.4.2 Templates and Forms ............................. 32
5.5 Front-End Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.6 Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6 Model Comparison 36
6.1 Phishing URL Detection Models ........................... 36
6.2 PDF Malware Detection Models ........................... 37
6.3 Evaluation Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.4 Model Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
7 Results 41
7.1 Phishing URL Detection Model Performance .................... 41
7.2 PDF Malware Detection Model Performance .................... 42
7.3 Web Interface and User Experience . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Appendix 48
A Sample Input URLs and PDFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
B Screenshots of Project Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Bibliography 51
List of Figures
1.1 Machine Learning Pipeline for Threat Detection .................. 5
2.1 Typical Architecture of a Phishing Detection System . . . . . . . . . . . . . . . . 11
2.2 Static Feature-Based Workflow for PDF Malware Detection . . . . . . . . . . . . 11
4.1 System Architecture for Intelligent Detection . . . . . . . . . . . . . . . . . . . . 19
4.2 Phishing URL Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3 PDF Malware Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Feature Extraction Pipeline for URL and PDF Inputs . . . . . . . . . . . . . . . 22
4.5 Django Flow Integration with Machine Learning Models . . . . . . . . . . . . . . 25
5.1 Django Web Interface for Phishing and PDF Malware Detection ......... 33
5.2 Layered Deployment Diagram of the Phishing and PDF Detection System . . . . 34
6.1 Accuracy of Different Models for Phishing URL Detection . . . . . . . . . . . . . 37
6.2 Accuracy of Different Models for PDF Malware Detection . . . . . . . . . . . . . 38
6.3 Important Features for Phishing URL Detection .................. 39
6.4 Important Features for PDF Malware Detection . . . . . . . . . . . . . . . . . . . 39
7.1 Confusion Matrix - Phishing URL Detection (XGBoost) . . . . . . . . . . . . . . 41
7.2 Confusion Matrix - PDF Malware Detection (XGBoost) . . . . . . . . . . . . . . 42
7.3 Phishing URL Detection Form ............................ 43
7.4 PDF Malware Detection Upload Interface . . . . . . . . . . . . . . . . . . . . . . 43
1 Detecting the legitimate URLs ............................ 49
2 Detecting the Phishing URLs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3 Detecting the Benign PDF .............................. 50
4 Detecting the Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Chapter 1 – Introduction
1.1 Overview
To address the evolving needs of modern education, the Virtual Lab Assistant has been
developed as an intelligent, web-based application using Python Flask. The system is
designed to assist students and researchers in conducting virtual experiments,
accessing laboratory resources, and receiving real-time, AI-driven guidance. It
provides a centralized platform for managing lab activities, ensuring improved
efficiency, accuracy, and accessibility.
1.3 Objectives
The Virtual Lab Assistant project is driven by the following key objectives:
• To develop a Flask-based web application that offers virtual lab assistance.
• To integrate AI-driven support for answering procedural questions and troubleshooting issues.
• To implement voice-assisted guidance and navigation through lab experiments.
• To provide digital manuals, equipment usage guidelines, and safety instructions.
• To allow students to generate structured experiment reports automatically.
• To monitor and log lab activity for further review by instructors.
• To create a scalable solution adaptable to multiple departments and lab types.
1.4 Scope
This project is primarily focused on creating a digital assistant for science and
engineering labs within educational institutions. The Virtual Lab Assistant
addresses the gap between traditional lab setups and modern educational
technologies by offering:
• Web-based interaction for accessing lab procedures and tools.
• AI-powered support for step-by-step experiment execution.
• Voice commands for hands-free lab navigation.
• Automation in documentation and evaluation of student performance.
The assistant is intended for use in computer science, electronics, physics, and
related lab environments. While this version is tailored for academic use, the
architecture is scalable and can be expanded to industrial lab settings, mobile
platforms, or integrated with Internet of Things (IoT) devices for real-time
equipment monitoring in future iterations.
The core aim of this project is to develop a smart, user-friendly, and interactive
lab assistant that supports students in conducting experiments independently
and accurately. It seeks to minimize errors, enhance accessibility, and reduce
the reliance on physical instructors by using AI and modern web technologies.
This system aligns with the growing demand for blended learning, offering a
hybrid model of education that supports both physical and virtual modes of
experimentation.
Chapter 2 – Literature Review
2.1 Existing Systems for Lab Automation
Over the last decade, several systems have been introduced to modernize
laboratory practices and support educational institutions in managing lab
infrastructure more efficiently. Traditional lab management software typically
focuses on inventory tracking, lab scheduling, and equipment allocation. While
these tools help streamline operations, they often lack interactive features or
intelligent assistance for students during experiments.
Some academic institutions have adopted Learning Management Systems
(LMS) integrated with lab modules, such as Moodle with virtual lab extensions.
These systems allow students to upload results or follow experiment
instructions but provide limited real-time assistance. Additionally, commercial
lab automation tools like LabWare and STARLIMS are designed for industrial
use and are generally not tailored for educational settings or student
engagement.
Furthermore, systems like Virtual Labs by NPTEL and Amrita Vishwa
Vidyapeetham provide pre-recorded simulations and manual-based execution
of experiments. However, they do not offer dynamic feedback, voice guidance,
or AI-driven interaction, which are crucial for individualized support and
accessibility.
AI-driven assistants like IBM Watson Tutor, Socratic by Google, and custom
chatbots are now used to answer student queries, recommend learning
materials, and guide problem-solving in various subjects. In lab environments,
AI has the potential to:
• Detect procedural errors.
• Suggest corrective steps.
• Answer questions about instruments and theory.
• Enable adaptive learning based on student input.
Despite these advancements, there are few implementations where AI is
directly used to assist hands-on experiments, especially in the form of a voice-
enabled lab guide embedded in a web application.
Voice user interfaces (VUIs) are becoming increasingly popular for accessibility
and ease of use. Systems like Amazon Alexa and Google Assistant have
normalized voice-based interaction in everyday tasks. In the educational
domain, VUIs can enhance the experience for users who are visually impaired
or those performing hands-on activities that make screen interaction difficult.
A few experimental systems have attempted to use speech synthesis and
recognition to provide navigational assistance in simulated environments.
However, voice-guided support during lab experiments is still underexplored,
especially in platforms meant for education.
While several lab automation and virtual learning tools exist, they often fall
short in the following aspects:
• Lack of real-time, contextual guidance during experiments.
• No support for natural language queries or intelligent error correction.
• Minimal integration of AI or voice interfaces for enhanced accessibility.
• Inability to personalize the learning experience based on user input.
• Dependence on static content or simulations, lacking interactive and adaptive components.
These gaps highlight the need for an intelligent lab assistant that can actively
guide students and support instructors with modern technologies.
Chapter 3 – System Analysis
3.1 Introduction
This chapter presents a comprehensive analysis of the system requirements
and design approach for the Virtual Lab Assistant. The goal is to assess the
technical, operational, and user-centric aspects to ensure the system is feasible,
scalable, and easy to use in academic environments. The chapter also outlines
the hardware and software resources necessary for successful development
and deployment.
Though systems like Virtual Labs (e.g., by NPTEL) and simulations exist, they
do not offer AI-based support, voice navigation, or personalized real-time
assistance, which leaves a significant gap in student experience and
accessibility.
3.3 Proposed System Overview
The proposed system provides:
• Digital access to lab manuals, safety guides, and equipment usage instructions.
• User authentication and session tracking for accountability.
This system enables students to independently perform lab tasks with confidence, reduces
the need for constant instructor intervention, and improves overall lab efficiency.
a) Technical Feasibility
b) Operational Feasibility
• The assistant simplifies lab activities for students and reduces instructor workload.
• It promotes self-learning and supports visually impaired students via voice assistance.
• It can be easily adapted for different lab subjects and expanded to more institutions.
c) Economic Feasibility
• The project is built entirely with open-source technologies.
• No additional infrastructure cost is required for basic usage.
• Hosting and deployment can be done on free tiers of cloud platforms.
3.9 Constraints
• Requires a microphone and speaker for voice functionality.
• Needs a stable internet connection for cloud-based deployment or updates.
• Speech recognition accuracy may vary with environmental noise.
3.10 Assumptions
• Users (students/instructors) have basic computer literacy.
• Devices used support modern browsers and audio features.
• Lab content (manuals, procedures) is provided by the institution.
Chapter 4 – System Design
4.1 Introduction
System design is a crucial phase in software development that transforms
requirements into a structured solution architecture. It ensures that all
functional and non-functional requirements are addressed in a scalable,
efficient, and user-friendly manner. In this chapter, we present the high-level
design of the Virtual Lab Assistant, including system architecture, module
descriptions, data flow diagrams, and user interface design principles.
3. Data Layer:
Stores lab manuals, user activity logs, experiment results, and generated
reports.
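The data layer also records user activity for later review by instructors, one of the objectives stated in Chapter 1. As a minimal stdlib-only sketch of such a logger (the file name `activity_log.jsonl` and the record fields are illustrative assumptions, not taken from the project):

```python
import json
from datetime import datetime, timezone

def log_activity(user, action, path="activity_log.jsonl"):
    """Append one activity record as a JSON line for later instructor review."""
    record = {
        "user": user,
        "action": action,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # One JSON object per line keeps the log easy to append to and to scan.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

An append-only JSON Lines file is enough for basic usage; a deployment serving many labs could swap this for a database table without changing the record shape.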
+-------------------------------+
| User Interface |
| (HTML/CSS/Bootstrap Forms) |
+-------------------------------+
|
↓
+-------------------------------+
| Flask Backend |
|------------------------------|
| - Routing & Views |
| - AI Query Engine |
| - Text-to-Speech (pyttsx3) |
| - Speech Recognition |
| - Report Generator (PDF) |
+-------------------------------+
|
↓
+-------------------------------+
| Data Storage |
| - Lab Manuals (Static Files) |
| - Report Logs |
| - User Info / Sessions |
+-------------------------------+
---
+--------------------+
| Student |
+--------------------+
|
+------------+------------+
| |
+------------------+ +---------------------+
| Upload Lab Input | | Ask for Guidance |
+------------------+ +---------------------+
| |
↓ ↓
+-----------------------------------------+
| Flask Server & AI Logic Engine |
+-----------------------------------------+
| |
+------------------+ +----------------------+
| Generate Report | | Respond with Voice |
+------------------+ +----------------------+
↓ ↓
+-------------------------+ +-------------------------+
| PDF Report Sent to User | | Voice Output to Browser |
+-------------------------+ +-------------------------+
3. AI Guidance Engine
Handles lab procedure queries using rule-based or NLP techniques.
Static files or database entries for lab experiments, safety instructions, and
equipment guides.
5. Report Generator
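The Report Generator module assembles the experiment report delivered to the user as a PDF. A minimal stdlib-only sketch of the assembly step is shown below; the function and field names are illustrative assumptions, and the real module would hand this text to a PDF library rather than return it as a string:

```python
from datetime import date

def build_report(student, experiment, steps_completed):
    """Assemble a plain-text experiment report.

    A simplified stand-in for the PDF generator described above: the
    structure (header, student details, completed steps) is the same,
    only the rendering target differs.
    """
    lines = [
        f"Experiment Report - {experiment}",
        f"Student: {student}",
        f"Date: {date.today().isoformat()}",
        "",
        "Steps completed:",
    ]
    # Number each completed step for the instructor's review.
    lines += [f"  {i + 1}. {step}" for i, step in enumerate(steps_completed)]
    return "\n".join(lines)
```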
Chapter 5 – Implementation
5.1 Introduction
This chapter outlines the practical development of the Virtual Lab Assistant,
focusing on how the system was built and how its components interact. Each
module, from voice recognition to report generation, was implemented using
Python Flask and integrated seamlessly into a functional web application. The
implementation aimed to offer a virtual learning experience that simulates
real-world laboratory guidance.
from flask import Flask, render_template

app = Flask(__name__)

# Home page route.
@app.route('/')
def index():
    return render_template('index.html')

# Entry point for a lab session: load and speak the first step.
@app.route('/start')
def start_lab():
    step = get_step(0)
    speak(step)
    return render_template('lab_session.html', step=step)
5.3.2 Voice Assistant (Text-to-Speech)
The voice assistant was implemented using pyttsx3, which allows offline text-to-speech conversion.
import pyttsx3

def speak(text):
    # Initialize the engine, queue the text, and block until speech finishes.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
This function is called every time a new experiment step is loaded or when the
user requests a repeat.
import speech_recognition as sr

def listen_command():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate for ambient noise before capturing the command.
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio).lower()
Experiment procedures are stored as structured JSON, for example:
{
"experiment": "Ohm's Law",
"steps": [
"Connect the circuit components.",
"Turn on the power supply.",
"Measure the voltage.",
"Record the current.",
"Apply the formula V = IR."
]
}
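The `/start` route shown earlier calls a `get_step` helper that does not appear in the listings. A minimal sketch, assuming the steps come from a JSON file shaped like the example above (the file name `experiments.json` and the helper's exact signature are assumptions):

```python
import json

def load_experiment(path="experiments.json"):
    """Load one experiment definition from a JSON file (assumed layout:
    an object with "experiment" and "steps" keys, as in the example above)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def get_step(index, experiment=None):
    """Return the step at `index`, or a completion message past the end."""
    if experiment is None:
        experiment = load_experiment()
    steps = experiment["steps"]
    if 0 <= index < len(steps):
        return steps[index]
    return "Experiment complete. You can download your report."
```

Returning a sentinel message past the last step lets the voice assistant announce completion without a separate code path.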
Features:
• Large buttons for voice actions (Start, Repeat, Help).
• Instruction window for step display.
• Real-time visual feedback for voice commands.
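The voice actions listed above (Start, Repeat, Help) must be mapped to server-side behavior once a phrase is recognized. A sketch of such a command router follows; the handler responses and the `state` dictionary shape are illustrative assumptions, not the project's actual logic:

```python
def handle_command(command, state):
    """Route a recognized phrase to an action on the session state."""
    command = command.lower().strip()
    if "start" in command:
        state["index"] = 0
        return "Starting the experiment."
    if "next" in command:
        state["index"] += 1
        return f"Moving to step {state['index'] + 1}."
    if "repeat" in command:
        return f"Repeating step {state['index'] + 1}."
    if "help" in command:
        return "Say start, next, or repeat to control the experiment."
    # Fallback keeps unrecognized input from advancing the session.
    return "Sorry, I did not understand that command."
```

Substring matching is deliberately loose so that phrases like "please repeat that" still trigger the intended action.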
This chapter has described the complete implementation of the Virtual Lab
Assistant, from core Python modules to frontend design and backend
integration. Using Flask, voice libraries, and a modular structure, the system
was developed to deliver an intelligent, hands-on lab experience that students
can access remotely or in the classroom. Every component was developed with
simplicity, scalability, and accessibility in mind.
Modules such as Flask views, AI logic, and voice functions were integrated and
tested for proper communication.
Scenario                             Description                                        Result
Voice command triggers Flask view    Saying "next" loads the next step in the UI        ✅ Successful
Instruction spoken after input       Experiment step read aloud via text-to-speech      ✅ Successful
Report download after completion     Generated report is downloadable in the browser    ✅ Successful
End-to-end testing was done from login to report generation, simulating actual
user behavior.
Action Performed               Expected Outcome                          Status
Start experiment via button    First step loads and is spoken            Passed
Say "Repeat"                   Step is repeated verbally                 Passed
Say "Help"                     System provides help message              Passed
Complete experiment            Report generated and download shown       Passed
Chapter 7 – Conclusion and Future Work
7.1 Conclusion
The Virtual Lab Assistant project successfully addresses the modern
educational need for intelligent, accessible, and interactive laboratory support.
By integrating AI-based logic and voice-enabled interaction into a Flask-based
web application, the system empowers students to perform experiments with
minimal supervision while still maintaining guidance and procedural accuracy.
7.3 Limitations
Despite the successful implementation, the system has certain limitations:
• Speech recognition may not work optimally in noisy environments.
• The AI assistant follows a rule-based model, limiting complex query handling.
• Offline voice features like pyttsx3 may have compatibility issues on certain devices.
• The system currently focuses on procedural labs; simulation-based or hardware-involved labs may require additional integration.