
Project ID: 22-26/CSIT/G27

A PROJECT REPORT ON
NyayaMitra: AI-Driven Legal App
Submitted
in partial fulfillment of the requirements for the award of
Bachelor of Technology
in
Information Technology
by
Shobhit Singh (2200270130164)
Trisha Kumari (2200270130183)
Samarth Gupta (2200270130148)
Soumya Shukla (2200270130174)

Under the guidance of


Dr. Sunil Kumar

AJAY KUMAR GARG ENGINEERING COLLEGE,

GHAZIABAD

September 18, 2025


Declaration

We hereby declare that the work presented in this report, entitled
NyayaMitra: AI-Driven Legal App, was carried out by us. We have not submitted
the matter embodied in this report for the award of any other degree or
diploma of any other university or institute. We have given due credit to
the original authors and sources for all the words, ideas, diagrams, graph-
ics, computer programs, experiments, and results that are not our original
contribution. We have used quotation marks to identify verbatim sen-
tences and given credit to the original authors and sources.

We affirm that no portion of our work is plagiarized, and that the experi-
ments and results reported in this report are not manipulated. In the event
of a complaint of plagiarism or manipulation of the experiments and
results, we shall be fully responsible and answerable.

Name : Shobhit Singh


Roll No. : 2200270130164

Name : Samarth Gupta


Roll No. : 2200270130148

Name : Trisha Kumari


Roll No. : 2200270130183

Name : Soumya Shukla


Roll No. : 2200270130174

i
Certificate
This is to certify that the report entitled NYAYA MITRA: AI DRIVEN
LEGAL APP, submitted by Shobhit Singh (2200270130164), Samarth
Gupta (2200270130148), Trisha Kumari (2200270130183), and Soumya
Shukla (2200270130174) to the Dr. A. P. J. Abdul Kalam Technical Uni-
versity, Lucknow (U.P.), in partial fulfillment of the requirements for the
award of the Degree of Bachelor of Technology in Information Technol-
ogy, is a bonafide record of the project work carried out by them under
our guidance and supervision. To the best of our knowledge, this report
in any form has not been submitted to any other university or institute
for any purpose.

Dr. Sunil Kumar Dr. Rahul Sharma


Assistant Professor Professor & HOD
Department of Information Department of Information
Technology Technology
Ajay Kumar Garg Ajay Kumar Garg
Engineering College Engineering College

Place: Ghaziabad
September 18, 2025

ii
Acknowledgements
We would like to express our thanks to all the people who have helped
bring this project to completion. We wish to put on record very special
thanks to our major project mentor, Dr. Sunil Kumar, for the support,
guidance, encouragement, and valuable insight with which he guided us
through the entire process. His mentorship has been pivotal in shaping
our project and leading us toward excellence.

We would like to thank our Head of the Department, Dr. Rahul Sharma,
who provided us with the resources and an environment that encourages
innovation and learning. We would also like to thank our teachers and
faculty members for all that they have shared with us at this crucial
juncture in our academic career, and the many others who helped out or,
with their presence, indirectly contributed to this project.

iii
Contents

Declaration i

Certificate ii

Acknowledgements iii

List of Figures vi

1 Introduction 1
1.1 Problem Statement of Project . . . . . . . . . . . . . . . . . 1
1.2 Scope of Project . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Detail of Problem Domain . . . . . . . . . . . . . . . . . . . 3
1.4 Gantt Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 System Requirements . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Project Report Outline . . . . . . . . . . . . . . . . . . . . . 5

2 Literature Review 6
2.1 Related Study . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Research Gaps . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Objective of Project . . . . . . . . . . . . . . . . . . . . . . 16

3 Methodology Used 17

4 Designing of Project 23
4.1 0-Level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 1-Level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3 2-Level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.4 Use Case Diagram . . . . . . . . . . . . . . . . . . . . . . . 28

5 Detailed Designing of Project 30


5.1 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.2 Entity-Relationship Diagram (ERD) . . . . . . . . . . . . . 32
5.3 Activity Diagram . . . . . . . . . . . . . . . . . . . . . . . 33
5.4 Sequence Diagram . . . . . . . . . . . . . . . . . . . . . . . 35

iv
Bibliography 37

Appendix A 38

Appendix B 39

Appendix C 40

Appendix D 42

v
List of Figures

1.1 Gantt Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . 4


3.1 Flow Diagram of Proposed Model . . . . . . . . . . . . . . 17
4.1 0-level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2 1-level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3 2-level DFD . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 Use Case Diagram . . . . . . . . . . . . . . . . . . . . . . . 28
5.1 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.2 ER Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.3 Activity Diagram . . . . . . . . . . . . . . . . . . . . . . . 34
5.4 Sequence Diagram . . . . . . . . . . . . . . . . . . . . . . . 35
5.5 Screenshot Of Database . . . . . . . . . . . . . . . . . . . . 38
5.6 Research paper . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.7 SDG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.8 Plagiarism Report . . . . . . . . . . . . . . . . . . . . . . . 43
5.9 AI Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

vi
Chapter 1

Introduction

1.1 Problem Statement of Project


Nyaya Mitra, or Legal Friend, is a system employed in the automation of
legal documentation, with a focus on police procedures and case manage-
ment. The Nyaya Mitra platform utilizes the power of advanced language
models and artificial intelligence to process and structure complex case
information for law enforcement personnel. Nyaya Mitra captures the
essential details of a complaint with high precision, which can help police
officers and legal professionals identify and apply the correct legal sections
for various offenses, such as theft or assault, and deal with procedural
errors and administrative backlogs, in addition to complex civil matters,
compliance issues, and jurisdictional questions, which are typically
considered sources of judicial delay.
Because it is automated, Nyaya Mitra offers a more accurate process than
manual drafting. If a task involves complex legal knowledge and extensive
paperwork, Nyaya Mitra should be used rather than traditional research
or manual entry. AI systems are better than manual methods at evaluating
the nuances of a case description and suggesting the most relevant legal
statutes and case law precedents. Both Nyaya Mitra and manual research
make legal information accessible; the major difference is that Nyaya
Mitra uses natural language processing, while manual methods rely on
physical books and human memory. In such situations, an AI system
enables the officer to see connections between case details and legal codes
with greater clarity. It is well suited to drafting initial reports, verifying
legal sections, and ensuring procedural compliance.
The use of AI is not yet common practice in front-line policing; almost
all practitioners still use traditional manual methods. Today, however,
AI-driven analysis is increasingly the method of choice, as it provides good
support for complex documentation and research. Officers are inclined
to prefer Nyaya Mitra over manual work since AI creates documents of
much better quality and at greater speed. Analysis can cover the initial
complaint, subsequent evidence, or both, depending on the stage of the
investigation. Our legal framework is a complex part of our governance,
since it houses our laws, and incorrect application of these laws can cause
serious problems. There are two kinds of errors in an FIR, procedural and
factual: procedural errors are legal mistakes, while factual errors are
inaccurate information. When procedural and factual errors enter a legal
document, they can damage the case, which is detrimental to justice.
Legal complaints come in all shapes and sizes. Broadly, legal issues fall
into two types: criminal and civil. The legal sections that may apply can
differ depending on the nature and details of the complaint. Some cases
involve direct violations of the law, while others create legal complexities
through related statutes. The intricate details of a case can be processed
easily by Nyaya Mitra; precise legal codes and minor details are difficult
for humans to analyze under pressure. Information gathering, analysis,
and document generation are examples of how Nyaya Mitra can be applied
to predict the correct course of action. This study uses deep learning
algorithms to create accurate legal documents with a much-improved
efficiency rate with the help of an intelligent assistant.

1.2 Scope of Project


• Legal Complaint Acquisition and Preprocessing: It takes in
any legal complaint acquired from user input, such as typed text, voice
dictation, and uploaded documents (PDF/TXT), with preprocessing
in the form of text extraction, noise reduction, and normalization,
which yields higher analysis accuracy.
• Feature Extraction: Incident details, names, locations, and con-
text of the complaint—the generally very important features in legal
text—will be extracted automatically by the system. These features
serve as vital input to the legal analysis and classification algorithms.
• Legal Section Suggestion and Classification: Implement ma-
chine learning and deep learning algorithms, particularly fine-tuned

Transformer models from HuggingFace, to automatically suggest ap-
propriate legal sections for an FIR and classify legal documents into
different categories (e.g., judgments, petitions, applications).
• Performance Evaluation of Models: Compare different architec-
tures (e.g., BERT, DistilBERT) and assess their strengths and weak-
nesses using metrics such as accuracy, precision, and recall. This helps
in determining the best approach for legal text analysis.
• Multi-Modal Data Integration: Combine user-provided text with
other data types like uploaded case files, historical judgments, or rel-
evant legal statutes to build a more comprehensive model that can
ensure FIR accuracy, improve investigation outcomes, and aid in the
delivery of justice.
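To make the legal-section classification item above concrete, the following is a minimal inference sketch, assuming a HuggingFace checkpoint fine-tuned as described in Chapter 3; the model path and label names are hypothetical placeholders, not the project's actual artifacts.

```python
from transformers import pipeline

# Hypothetical path to the fine-tuned checkpoint produced during training.
classifier = pipeline(
    "text-classification",
    model="./nyayamitra-legal-bert",
    top_k=3,  # return the three most likely legal sections
)

complaint = ("Yesterday evening two men broke the lock of my shop "
             "and took away cash and electronics.")

# With a list input, the pipeline returns one list of {label, score}
# dicts per complaint; the label set depends entirely on how the
# training data was annotated.
for suggestion in classifier([complaint])[0]:
    print(f"{suggestion['label']}: {suggestion['score']:.2f}")
```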

1.3 Detail of Problem Domain


The Nyaya Mitra System addresses the critical demand for accurate
and efficient legal documentation by using advanced text analysis along
with artificial intelligence techniques. The system primarily concerns the
analysis of legal documents, for instance FIRs and user complaints, to
identify and classify offenses by incident type, applicable laws, and rel-
evant precedents. Its problem domain lies within law enforcement,
legal technology, and artificial intelligence, where correctly drafted docu-
ments enhance the outcome of the investigation. The system makes use
of machine learning and deep learning models to identify patterns in
complaint texts so that it can precisely identify key details. Some
challenges exist, such as maintaining very high suggestion accuracy,
dealing with unstructured or low-quality complaint descriptions, and
distinguishing between relevant and irrelevant case details. The system
also needs to classify offenses into categories, such as criminal or civil, to
help investigating officers and legal professionals make the right decisions.
By automating the analysis process, the Nyaya Mitra System aims to
reduce human error, save time, and provide reliable results that improve
investigation quality and justice delivery.

3
1.4 Gantt Chart
A Gantt chart is a visual project management tool that displays tasks, their
duration, and dependencies on a timeline. It helps in tracking progress,
scheduling tasks, and improving team coordination.

Figure 1.1: Gantt Chart

4
1.5 System Requirements

1. Hardware Requirements:
Processor: Intel i7 or higher / AMD Ryzen 7 (or equivalent)
RAM: Minimum 16 GB (32 GB recommended for faster processing)
GPU: NVIDIA GTX 1080 or higher (for deep learning models)
Storage: SSD with at least 512 GB (to store datasets and models)
Display: High-resolution monitor (for data visualization)
Additional Devices: High-speed internet, external storage, and cooling
systems for GPU-intensive tasks.
2. Software Requirements:
Operating System: Windows 10/11, Ubuntu (Linux), or macOS
Programming Language: Python (preferred)
Libraries/Frameworks: TensorFlow / PyTorch (for model training);
Hugging Face Transformers (for NLP models); NumPy, Pandas,
Matplotlib (data analysis and visualization)
Development Environment: Jupyter Notebook, PyCharm, or VS Code

1.6 Project Report Outline


Chapter 2, Literature Review, summarizes the major research work in the
area, highlighting methods, results, and limitations, and critically assesses
previous work to explain how this project contributes to the literature.
Chapter 3, Proposed Model, describes the proposed solution steps: Text
Acquisition, Text Pre-processing, Entity & Intent Recognition, Feature
Extraction, Model Comparison, and Classification.
Chapter 4, System Design, contains data flow diagrams and the use case
diagram describing the functionalities of the project.
Chapter 5, Detailed System Design, contains the Class Diagram, Entity-
Relationship Diagram, Activity Diagram, and Sequence Diagram.
Chapter 6, Results and Discussions, shows the results achieved by the
project along with the accuracy and sensitivity of the model. It also
includes visual outputs, compares results with existing models, and
discusses improvements.

5
Chapter 2

Literature Review

2.1 Related Study


R. Sharma et al./2021 [1]: This paper proposes a framework using Natural
Language Processing (NLP) to assist in the analysis of legal documents
in the Indian context. The model is designed to classify legal texts into
categories like 'criminal' or 'civil' based on the narrative provided. The
approach aims to provide better foundational results compared to simple
keyword-based searches. The approach was trained and tested on a
dataset of 2,500 First Information Reports (FIRs) and court filings. For
performance evaluation, metrics like precision and recall were employed,
showing significant improvement in accurately categorizing documents,
with a total accuracy of 94%. This paper demonstrates NLP's ability to
structure and understand legal narratives. The proposed NLP pipeline
architecture is made explicit, and its application to a prepared collection
of legal texts demonstrates its effectiveness. Future work may focus on
refining the model to suggest specific legal sections, not just broad
categories.

A. Verma et al./2022 [2]: This paper's researchers designed a new legal
text analysis method called Legal-BERT for FIR narratives that mainly
focuses on accuracy and efficiency. First, it cleans up the user's input
text with preprocessing filters, then tokenizes the text into sections for
better analysis. Key legal entities and actions are extracted from each
section, and finally the specialized AI model analyzes this information to
establish which legal sections are applicable to the complaint.

Vikram Singh et al./2022 [3]: The purpose of the current work is to clas-
sify legal complaints into four classes of crime using a hybrid framework
called L-HTC. The framework takes an FIR narrative as input and applies
text normalization to standardize the language and reduce noise in the
input text. An entity recognition scheme was used to extract the core
elements of the crime, and feature extraction techniques were then applied
to capture the contextual characteristics of the incident. A hybrid
optimization method was applied to the feature vector, producing a fully
optimized dataset. Finally, many algorithms were tested, among which
the MLP algorithm excelled with a precision of 97.8% for the classification
of the four crime types. This framework has strong potential to help
police officers draft FIRs accurately and to minimize human error in such
highly critical legal tasks.

Ananya Joshi et al./2022 [4]: Researchers used transfer learning across
several deep learning models to identify which one should be utilized
for suggesting legal sections from complaint narratives. Seven feature
extraction methods from NLP were compared using five different metrics,
such as accuracy and F1-score. The analysis determined that a fine-tuned
RoBERTa pre-trained model combined with an SVM classifier obtained
an accuracy of 99.5%.

Priya Das et al./2022 [5]: A deep learning network trained from scratch
on a custom Indian legal document dataset, using data augmentation and
a combined loss function. This approach has yielded promising results,
showing effectiveness in identifying key entities like 'victim', 'accused',
and 'offense' with high accuracy. Compared with existing legal text
analysis approaches, the proposed method provides a promising alterna-
tive, and the applied modifications allow significant improvements in
results. The architecture may be applied to a wide range of legal document
analysis tasks beyond the application studied.

Sanjay Kumar et al./2023 [6]: This paper concentrates on automatically
suggesting appropriate legal sections from complaint narratives. It com-
pares classical machine learning methods (e.g., TF-IDF-based classifiers)
with the more powerful deep learning approach, which relies on a
Transformer network to analyze text. The Transformer proved signifi-
cantly better than the classical methods and, in many settings, achieved
nearly 97% accuracy. However, in a few instances, the models were unable
to differentiate between civil and criminal matters with overlapping
language. The authors suggest that this technology should be advanced
further so that these two domains can be distinguished reliably; for this,
legal-domain-specific datasets should be designed. The findings show that
deep learning is promising for the analysis of legal text, though there
remains much scope for improvement.

Nidhi Gupta et al./2022 [7]: They proposed two state-of-the-art deep
learning architectures to analyze legal text, both designed to generalize
and both performing excellently across large numbers of FIRs and court
documents, with few examples where information may have been com-
promised. To further enhance the accuracy of one of these models, the
researchers introduced a specific data augmentation method based on
paraphrasing legal text. The experiments show good promise for these
models: 96.5% and 99% accuracy were obtained on distinct test sets,
significantly outperforming previously published techniques.


Amit Singh et al./2023 [8]: This work addresses the problem by training
a dedicated deep learning model (a Transformer) on a large number of
samples of legal complaints. The objective is early and accurate sugges-
tion of legal sections. The work is also tested against many alternative
approaches and existing programs. One limitation encountered was that
training time is very long on less powerful hardware, and it would grow
further with even larger datasets. Supplementary information may come
from sources such as case law history.

11
Table 2.1: Literature Review

Table 2.2: Literature Review

12
2.2 Research Gaps
1. Limited Availability of Annotated Legal Datasets
Problem: There are very few high-quality, annotated, and publicly acces-
sible datasets of Indian legal complaints (FIRs) due to significant privacy
concerns and the difficulty in getting accurate, section-level annotations
from legal experts.
Impact: This scarcity limits the model’s ability to generalize across the
highly diverse population of criminal cases, which have wide variations in
narrative style and complexity.
Future Direction: Investigate text-based data augmentation techniques
or build synthetic legal datasets with the help of Generative Adversarial
Networks (GANs) and Large Language Models (LLMs).

2. Class Imbalance in Crime Data


Problem: Datasets of legal complaints will inherently have a skewed dis-
tribution. Common offenses like theft will be heavily overrepresented com-
pared to rarer, more complex crimes like specific cybercrimes or organized
financial fraud.
Impact: This leads to biased models that favor suggesting sections for the
majority class (common crimes) and exhibit poor performance on minority
classes, despite their high importance.
Future Direction: Apply oversampling (e.g., SMOTE), undersampling,
or implement cost-sensitive learning algorithms to address this class im-
balance during model training.
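As a minimal sketch of the cost-sensitive direction, class weights can be derived from label frequencies and passed to the loss function; the counts below are invented purely for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical label counts: common offenses dwarf rare ones.
counts = torch.tensor([5000., 3200., 150., 40.])  # e.g. theft, assault, cybercrime, fraud

# Inverse-frequency weighting: rare classes contribute more to the loss,
# counteracting the bias toward majority classes.
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 4)          # dummy batch of classifier outputs
labels = torch.randint(0, 4, (8,))  # dummy gold section labels
loss = criterion(logits, labels)    # used exactly like the unweighted loss
```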

3. Lack of Explainability and Interpretability


Problem: Sophisticated NLP models, like Transformers (e.g., BERT),
are often considered ”black boxes,” and it is not transparent to a police
officer why a specific legal section was suggested.
Impact: This lack of trust and transparency severely reduces the likeli-
hood of adoption in a real-world legal environment where accountability
is paramount.
Future Direction: To address this, integrate explainability techniques
like LIME or SHAP that explain model decisions by visualizing which
specific words or phrases in the complaint led to a suggestion.
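A minimal LIME sketch is shown below; `predict_proba` is a stand-in for whatever probability function the deployed model actually exposes, and the section names are illustrative.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

SECTIONS = ["theft", "assault"]  # illustrative class names

def predict_proba(texts):
    # Placeholder: the real function would run the fine-tuned model and
    # return an (n_texts, n_classes) array of probabilities.
    return np.tile([0.8, 0.2], (len(texts), 1))

explainer = LimeTextExplainer(class_names=SECTIONS)
explanation = explainer.explain_instance(
    "Two men broke the lock of my shop and stole cash.",
    predict_proba,
    num_features=6,  # top words driving the suggestion
)
# [(word, weight), ...] -- the officer can see which phrases mattered.
print(explanation.as_list())
```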

4. Generalization Across Languages and Jurisdictions


Problem: An NLP model is commonly trained on a dataset in a single
language (e.g., English), but in actual practice in India, complaints are
filed in numerous regional languages and dialects with unique legal collo-
quialisms.
Impact: The usage of the model is thus restricted only to the specific
linguistic and jurisdictional scenarios that the training data was exposed
to.
Future Direction: Train and test models on multilingual, code-switched
datasets and explore cross-lingual transfer learning to improve versatility
and national-level applicability.

5. High Computational Requirements


Problem: Large language models, being deep and complex, require signif-
icant computational resources (GPUs, high RAM), making it challenging
to deploy them in low-resource settings or on standard police station com-
puters.
Impact: This digital divide hinders its application and scalability in re-
mote, rural, or underdeveloped areas where such a tool might be needed
most.
Future Direction: Investigate lightweight NLP architectures (e.g., Dis-
tilBERT, MobileBERT) or apply model compression techniques like quan-
tization and pruning to optimize performance.
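For example, post-training dynamic quantization in PyTorch shrinks the Linear layers of a fine-tuned model to int8; the checkpoint name below is a stand-in for the actual trained model.

```python
import torch
from transformers import AutoModelForSequenceClassification

# Stand-in checkpoint; in practice, load the fine-tuned legal model.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=10)

# Rewrites Linear layers with int8 weights: a smaller model and faster
# CPU inference at a small accuracy cost -- no GPU required.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)

torch.save(quantized.state_dict(), "nyayamitra_quantized.pt")
```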

6. Difficulty in Detecting Subtle or Complex Crimes


Problem: The standard mechanisms of NLP models might fail to grasp
the context of subtle, multi-layered, or irregularly described crimes that
are not explicitly stated.
Impact: Challenging cases involving conspiracy, abetment, or complex
fraud are more likely to result in incomplete section suggestions or misdi-
agnosis of the crime’s gravity.
Future Direction: Add attention mechanisms or use hybrid models, such
as a combination of Transformers and Graph Neural Networks (GNNs), to
better map the relationships between entities and events in a narrative.

7. Lack of Robustness to Noise and Artifacts


Problem: Real-world legal complaints often contain significant noise,
such as spelling errors, grammatical mistakes, emotional language, and
variations in transcription quality from voice-to-text.
Impact: This noise reduces the robustness and reliability of the model’s
predictions, leading to inaccurate suggestions.

Future Direction: Employ more advanced text preprocessing techniques,
including robust spell-checkers, slang normalization, or domain adaptation
methods that can make the model more resilient to noisy data.

8. Narrow Focus on Real-Time Section Suggestion


Problem: The majority of studies tend to focus more on attaining high
accuracy for section suggestion, but at the cost of not being optimized for
other crucial tasks like entity extraction (names, places) or generating a
structured event summary.
Impact: This hinders integration into the complete police workflow where
a comprehensive summary and structured data are crucial for subsequent
investigation.
Future Direction: Evolve the model into a multi-task learning frame-
work that simultaneously suggests sections, performs Named Entity Recog-
nition (NER), and generates case summaries.

9. Ethical and Privacy Concerns


Problem: Using and sharing real complaint data for training and eval-
uation poses severe ethical issues regarding complainant privacy and
data security, especially concerning Personally Identifiable Information
(PII).
Impact: This legally and ethically restricts data sharing and collabora-
tive research, slowing down progress in the field.
Future Direction: Develop federated learning frameworks to train mod-
els on decentralized data without sharing the raw text. Implement robust
PII anonymization protocols during preprocessing.
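A minimal sketch of such an anonymization pass, assuming spaCy's small English model, is given below; a production system would need far more thorough patterns, particularly for Indic names and scripts.

```python
import re
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes the model is installed
PHONE = re.compile(r"\b\d{10}\b")    # crude mobile-number pattern

def anonymize(text: str) -> str:
    """Replace detected names and places with placeholder tags."""
    doc = nlp(text)
    for ent in reversed(doc.ents):   # reversed so char offsets stay valid
        if ent.label_ in {"PERSON", "GPE", "LOC"}:
            text = text[:ent.start_char] + f"[{ent.label_}]" + text[ent.end_char:]
    return PHONE.sub("[PHONE]", text)

print(anonymize("Ravi Kumar of Ghaziabad called 9876543210 to report a theft."))
```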

10. Not Integrated with Official Police Systems


Problem: Current models are often developed as standalone proof-of-
concepts and are not integrated into more general police workflow systems
like the Crime and Criminal Tracking Network & Systems (CCTNS).
Impact: This siloed approach reduces the practical usability of such mod-
els in police stations, as it forces officers to perform redundant data entry
and switch between multiple systems.
Future Direction: Develop secure APIs to link the model’s output with
official police decision systems to support in-depth investigation and pro-
vide a seamless workflow.

15
2.3 Objective of Project
1. Develop a Legal Section Suggestion System :
To design and implement a model using Transformer networks, specifically
leveraging a fine-tuned BERT architecture, for automated analysis of legal
complaint texts and suggestion of appropriate penal sections.

2. Enhance Model Accuracy :


To improve the accuracy and reliability of the legal section suggestion
model by employing advanced NLP techniques such as transfer learning,
text-based data augmentation, and hyperparameter optimization.

3. Address Dataset Challenges :


To preprocess legal text datasets to handle challenges such as noise (spelling
errors, colloquialisms), class imbalance of crime types, and unstructured
text, ensuring that the model generalizes well across diverse complaint
narratives.

4. Optimize for Real-World Use :


To reduce computational requirements and optimize the model for deploy-
ment in real-time scenarios, such as at a police helpdesk or on standard-
issue computers.

16
Chapter 3

Methodology Used
This section describes the proposed methodology, which can classify legal
complaints and suggest appropriate penal sections using NLP-based deep
learning techniques. This study aims to objectively identify applicable
legal statutes from complaint narratives through the use of modern deep
learning techniques based on natural language understanding and text
classification. The designed system takes as input raw legal complaint
texts and then performs a number of text processing steps towards legal
section suggestion. In this first step, the input text is preprocessed using
a Transformer-based model that aids in improving the quality of the text
by correcting spelling, normalizing slang, and removing noise.

Figure 3.1: Flow Diagram of Proposed Model

3.1 Text Acquisition

This is the process in which text data is collected for further pro-
cessing. In this project's case, these texts are legal complaints or
First Information Reports (FIRs) collected from end-users through
the user interface of the web application.

3.2 Text Pre-processing

In this process, the text collected from end-users is pre-processed
to remove noise (e.g., irrelevant characters, spelling errors) and to
make it useful for further processing. Raw text data is often unstruc-
tured and difficult to analyze directly; pre-processed data is easier
for the model to use and analyze.

3.3 Entity & Intent Recognition

This is a technique used to simplify a large body of text by splitting it
into various parts, such as identifying key entities (names, locations,
dates) and the primary intent of the complaint. This makes it easier
for the model to analyze the text in subsequent steps.

3.4 Feature Extraction

In this process, useful semantic features are extracted from the large
text dataset by converting words and sentences into numerical repre-
sentations (embeddings). This eliminates the need for manual feature
engineering and helps the model understand the context and meaning
of the complaint.

3.5 Model Comparison

This process involves comparing the performance metrics (such as
accuracy, precision, and F1-score) obtained from various NLP models
or algorithms. This is used to find the best-performing model for the
task.

3.6 Classification / Section Suggestion

The final step is the classification of the complaint into predefined
crime categories and the suggestion of appropriate legal sections, dis-
playing the results with the highest possible accuracy.

We present the outcomes of our proposed work, "Analysis of Legal
Complaints for Automated Section Suggestion using Deep
Learning". It is especially designed to acquire efficient and accurate
legal insights from text. We begin with text data augmentation and
move on to classifying the nature of the crime and suggesting relevant
legal sections.
For this work, we needed high computational power (a GPU) for
analyzing and training our large language model. We tested our
method on various legal complaint texts covering different types of
crimes.
The algorithm takes legal complaint text as input and performs sev-
eral text processing operations on it to identify applicable legal
sections. The first process is preprocessing the input text using NLP
techniques to enhance its quality by handling noise and normalizing
the language. In addition, word and sentence embeddings are used
for the extraction of semantic features like context, meaning, and
intent. Finally, a deep learning Transformer-based mechanism (such
as BERT) is used for the prediction and classification of the various
forms of offenses.

3.1.1 Dataset:
For an investigation of deep learning for the suggestion of legal
sections from complaint texts, we need a well-curated dataset com-
prising legal complaints along with annotations that indicate the
correct corresponding legal sections.
Here are some recommendations to acquire or design an appropriate
dataset:
– Public Datasets: One should begin to look for public datasets
of legal documents, such as court judgments or anonymized FIRs
that are sometimes made available for research.
– Kaggle: Go through relevant text datasets available on Kaggle.
There are a number of datasets related to legal text classification
and analysis.
– Collaborate with Law Firms or Police Departments: You
can contact legal professionals or institutions to obtain de-identified
datasets of legal complaints. One must follow all the ethical and
legal provisions applicable to such sensitive information.

3.1.2 Data Augmentation: It is a trick in a machine learner’s hat!


When one has limited training data, it cooks up new variations from
existing samples. This may involve paraphrasing sentences, re-
placing words with synonyms, or back-translation (translating
text to another language and back). Because it increases the diversity
of the training data, data augmentation helps NLP models to be more
adaptable and less prone to overfitting on the original data.
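As one minimal sketch of synonym replacement using NLTK's WordNet, see below; real augmentation of legal text would need to protect statute names and legal terms of art from substitution.

```python
import random
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def synonym_augment(sentence: str, p: float = 0.3) -> str:
    """Randomly swap words for WordNet synonyms to create a new variant."""
    out = []
    for word in sentence.split():
        synsets = wordnet.synsets(word)
        if synsets and random.random() < p:
            lemmas = [l.name().replace("_", " ")
                      for s in synsets for l in s.lemmas()]
            candidates = [l for l in lemmas if l.lower() != word.lower()]
            out.append(random.choice(candidates) if candidates else word)
        else:
            out.append(word)
    return " ".join(out)

print(synonym_augment("The accused broke into the house and stole jewellery"))
```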

3.1.3 Pre-processing: Deep learning models need deep preparation


in terms of preprocessing of the data. For a project on suggesting legal
sections from complaint texts, the following pre-processing might be
involved:
– Data Cleaning: Remove any irrelevant characters, extra whites-
pace, or corrupted data present in your text files. Ensure that
the dataset is clean and anomaly-free.
– Normalization: Normalize the text by converting it to a stan-
dard case (e.g., lowercase) and handling slang or colloquial terms.
This helps the model treat similar words consistently.
– Tokenization & Padding: Reshape all text inputs to a stan-
dard and usable length. Text is split into tokens (words or sub-
words), and sequences are padded or truncated to a fixed size, as
most deep learning models work with fixed input dimensions.
– Label Encoding: If your labels (legal sections) are in a cate-
gorical format (e.g., ”IPC 302”), we need to translate them into
a numerical format that can be used to train models.
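The tokenization, padding, and label-encoding steps above map onto standard library calls; a sketch assuming a BERT-family tokenizer follows, with invented example complaints and labels.

```python
from sklearn.preprocessing import LabelEncoder
from transformers import AutoTokenizer

texts = ["Two men stole my motorcycle from the parking lot.",
         "My neighbour threatened me with a knife."]
sections = ["IPC 379", "IPC 506"]  # categorical labels from annotation

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Pad/truncate every complaint to a fixed length of 256 tokens.
encodings = tokenizer(texts, padding="max_length", truncation=True,
                      max_length=256, return_tensors="pt")

# Translate section strings into integer class IDs for training.
labels = LabelEncoder().fit_transform(sections)  # e.g. array([0, 1])
print(encodings["input_ids"].shape, labels)
```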

3.1.4 Training/Testing/Validation Set: While working on most


machine learning or deep learning projects, we have to split our
dataset into three sets: training, validation, and testing. Here is a
general outline of how we could split it for our project:
– Training Set: The deep learning model uses this training set in
the training process. It should make up 70-80% of your dataset.

In training, it learns patterns and features from the input texts
and their corresponding legal section labels.
– Validation Set: The validation set is used for hyperparame-
ter tuning and to monitor the performance of the model during
training. It assures us of the way our model generalizes with new,
unseen data. As a thumb rule, allocate about 10-15% of your
dataset to the validation set.
– Testing Set: The testing set is kept completely isolated from
the training and validation sets and is used in judging the final
performance of our learned model. It provides an unbiased esti-
mate of how well our model will most likely do on new, real-world
data. Reserve the last 10-20% of your dataset for the testing set.
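This split can be done in two calls to scikit-learn's train_test_split, as sketched below; stratifying on the section labels keeps rare offenses represented in every subset.

```python
from sklearn.model_selection import train_test_split

# texts: complaint strings; labels: integer section IDs (see 3.1.3).
# First reserve 70% for training...
X_train, X_rest, y_train, y_rest = train_test_split(
    texts, labels, test_size=0.30, stratify=labels, random_state=42)

# ...then split the remaining 30% evenly into validation and test sets.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=42)
```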

3.1.5 BERT (Bidirectional Encoder Representations from


Transformers): BERT is a language representation model devel-
oped by Google. In 2018, it was introduced and set forth to achieve
state-of-the-art performance on a wide range of Natural Language
Processing tasks. This indicates its immense power in the deep learn-
ing approach towards language understanding.
BERT is a very deep, very powerful language model. The key concepts
are:
– Input: A fixed-length sequence of tokens (words or sub-words).
– Transformer Encoder Blocks: Layers of ”building blocks”
that understand context bidirectionally.
∗ Each block uses a Self-Attention mechanism to weigh the
importance of different words when processing any single word
in a sentence.
∗ Feed-Forward Networks process the output of the atten-
tion layers.
– Output Layer: A classification layer that takes the processed
text representation and learns to map it to the final output.
– Softmax Layer: Produces the likelihood of a complaint corre-
sponding to various legal sections.

3.1.6 Feature Extraction: In this process, useful semantic features


are extracted from the text. The Transformer model converts the raw

text into high-dimensional numerical vectors (embeddings) that cap-
ture the context and meaning. This is helpful for further analysis, as
the model can work with these meaningful numerical representations.
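One common convention, sketched below, is to take the encoder's output at the [CLS] position as a fixed-size embedding of the whole complaint; other pooling choices are possible.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

batch = tokenizer(["The shop lock was broken and cash stolen."],
                  padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**batch)

# last_hidden_state has shape (batch, seq_len, 768); position 0 is the
# [CLS] token, used here as the sentence-level feature vector.
embedding = outputs.last_hidden_state[:, 0, :]
print(embedding.shape)  # torch.Size([1, 768])
```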

3.1.7 Model Comparison: This process involves comparing the


features and performance metrics that are obtained from various tech-
niques or NLP models (e.g., BERT vs. DistilBERT). This is used to
find the best algorithm or model for our specific task.
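In code, the comparison reduces to computing the same metrics for each candidate model's predictions on a shared test set; y_test and the prediction arrays below are assumed to exist from the earlier steps.

```python
from sklearn.metrics import accuracy_score, classification_report

# preds_bert / preds_distil: predicted section IDs from each model.
for name, preds in [("BERT", preds_bert), ("DistilBERT", preds_distil)]:
    print(name, "accuracy:", accuracy_score(y_test, preds))
    print(classification_report(y_test, preds))  # per-class precision/recall/F1
```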

3.1.8 Classification: The final step is the classification of the com-
plaint and the suggestion of the applicable legal sections, presenting
the results accurately through the web application.

22
Chapter 4

Designing of Project
4.1 0-Level DFD

Figure 4.1 outlines the general flow of a brain tumor detection system,
covering the general interaction between patients, doctors, medical
equipment, and the system itself.

Figure 4.1: 0-level DFD

1. Patients: The process starts with patients who require a diagnosis
of brain tumors. Patients undergo MRI scans, which serve as the input
to the brain tumor detection system. The MRI images become vital
data in identifying and classifying brain tumors.
2. Brain Tumor Detection System: This is the core system of
the entire process. It takes in MRI images, processes them, and
applies machine learning or deep learning algorithms, including CNNs,
to discern whether or not an individual has a brain tumor. If a tumor
exists, it further categorizes the type as malignant or benign.
- The system relies on medical equipment for quality MRI images;
the machinery produces images that are clear and accurate, thereby
supporting a correct diagnosis. From the analysis, the system pro-
duces output showing the existence of a tumor, its classification, size,
and position, among other relevant facts.
3. Doctors: The output of the brain tumor detection system is
communicated to doctors. It provides health care professionals with
sufficient information to make an appropriate decision toward the
diagnosis and treatment of patients. Doctors can use the outcome to
diagnose the condition, develop treatment strategies, and provide a
proper prognosis to patients.

4.2 1-Level DFD

Figure 4.2 shows the workflow of developing a brain tumor detection
and classification model with Logistic Regression (LR).
Figure 4.2: 1-level DFD

It is segmented into four phases and can accordingly be divided into
four blocks: Image Preprocessing, Image Segmentation, Feature
Extraction, and LR Model Training; the resulting product is the
trained Logistic Regression (LR) model. The details are as follows:
1. Image Pre-processing (1.1): This is the training image prepa-
ration stage. Operations such as resizing, normalization, contrast
enhancement, and noise reduction are applied to maximize image
quality. These operations bring the input data to one standard,
making it appropriate for further processing and improving the overall
accuracy of the model.
2. Image Segmentation (1.2): The images are segmented after pre-
processing. The regions of interest, in this case the tumor areas, are
separated from the rest of the brain tissue during this procedure.
Segmentation isolates the tumor for further study; techniques include
thresholding, region-based segmentation, and edge detection to define
tumor boundaries.
3. Feature Extraction (1.3): After segmentation of the tumor
areas, key features are extracted from them. Feature extraction is
the process of quantifying important characteristics such as tumor
size, shape, texture, and intensity. These features are used as input
by the classification model; feature extraction transforms image data
into numerical values describing the tumor.
4. Training the LR Model (1.4): This stage uses the extracted
features to train the Logistic Regression (LR) model. The logistic
regression model is trained on a labeled dataset to map input features
to output classes. For brain tumor classification, it learns to classify
the two major types of tumors, benign and malignant, by their
respective extracted features.

4.3 2-Level DFD

Figure 4.3 shows the full workflow diagram of the modified ResNet50
model, used mainly for the detection and classification of brain tu-
mors. It covers the process from data preprocessing up to the classi-
fication of brain tumors.
1. Watershed Enhanced Segmentation: Enhanced watershed
segmentation separates the tumor from the background of the brain.
Morphological restoration and marker extraction then fine-tune the
resulting image so that the tumor is well segmented.
2. Data Set Creation: The segmented images are stored in a
dataset, which is applied when training and testing the machine
learning model.
Figure 4.3: 2-level DFD

3. Preprocessing: All the images are resized to a standard size for
uniformity, and pixel intensity values are standardized, enhancing
image quality by removing variations captured in the images.
Thresholding and Denoising: This removes unwanted noise and arti-
facts from the images, giving prominence to the tumor regions.
Contour Detection: This enables tracing the contours of the tumor
present in the brain image.
Extreme Points Detection: This helps detect the extreme points of
the tumor so that proper image segmentation occurs.
4. Image Augmentation: The dataset is artificially increased by
applying data augmentation techniques. This improves generalization
by creating variations within the data, for instance through rotation,
scaling, or flipping.

5. ResNet50 Model: The preprocessed and augmented data is
passed to the modified version of the ResNet50 model, a very deep
network well suited to multi-class problems such as image classifi-
cation. The model is trained on the task of tumor classification and
detection.
6. Classification Output: Finally, the system outputs the classifi-
cation result as benign or malignant using the features obtained from
the images. This workflow achieves precise classification of a given
brain tumor through a process that combines preprocessing, segmen-
tation, and deep learning.

4.4 Use Case Diagram


Figure 4.4 below depicts a use case for a brain tumor detection and
classification system involving two main actors, namely the Developer
and the User.

Figure 4.4: Use Case Diagram

1. Role of the Developer
The developer handles the back end of the system. The major tasks
are:
– Input MRI Images: This sets the system up to accept MRI
images.
– Data Preprocessing: Cleaning, normalization, and enhancement
of images to obtain good-quality data.
– Feature Extraction: Relevant features from the MRI images,
such as size, texture, or intensity, are among the chief factors for
classification.
– Training: The model is trained with the extracted features using
machine learning algorithms such as deep models.
– Feature Matching: After training, the system uses learned
features to classify and detect tumors by matching newly input
images.
2. Role of the User
The user interacts with a simple front-end interface to the system.
The most important activities are:
– Select Image: The patient selects an MRI image for input
analysis.
– Upload MRI Image: The system accepts the MRI image up-
loaded by the patient.
– Get Prediction: The user receives the prediction that the system
computes with its model as to what kind of tumor, benign or
malignant, is present.

29
Chapter 5

Detailed Designing of Project

5.1 Class Diagram

Figure 5.1 represents the detailed workflow and class structure of the
proposed system to automatically detect and classify brain tumors
from MRI images. The system can be broken into primary categories:
pre-processing, feature extraction, segmentation, and classification.
The components and the interrelationships between them are de-
scribed below. The interface class allows the user to access the system.
The major methods include:
– loadMRIImage(input): Loads the MRI image of the brain.
– getTypeOfTumor(): Returns the classification result determining
whether the tumor is benign or malignant.
– getSegmentedTumor(): Returns the segmented area of the MRI
scan where the tumor is located.
1. Pre-processing: The image enhancement stage; techniques here
enhance image quality and include:
– Skull Stripping: Removes non-brain tissues, such as the skull, to
focus on the regions of the brain.
– Normalization: Standardizes image intensities for better analysis.
– Denoising: Reduces noise in MRI images so that key features can
be identified more clearly.

Figure 5.1: Class Diagram

2. Feature Extraction: After pre-processing, the key features of
the MRI image are extracted. These include statistical parameters
such as Mean, Standard Deviation, Entropy, and Skewness,
giving a description of the texture and shape of a possible tumor.
These form the input to the classification model.
3. Segmentation: The image is broken into regions of interest by
segregating abnormal tissues (tumors) from normal brain tissue. This
is done through
SVM: a learning algorithm that works on the extracted features and
classifies regions as belonging to either the tumor class or the
non-tumor class.
4. Classification: This determines whether the identified tumor is
benign or malignant. It is achieved through:
– Drawing up a bipartite graph based on the segmented images.
– Maximal-match classification, identifying the type of tumor by
retrieving stored models from a database.
Briefly speaking, the system works step by step from preprocessing
of the brain MRI image through feature extraction, segmentation,
and classification for correct results in the detection and classification
of brain tumors. Statistical analysis along with machine learning
techniques like SVM plus graph-based classification is performed for
robust performance.

5.2 Entity-Relationship Diagram (ERD)

Figure 5.2 is the Entity-Relationship (ER) Diagram of a system that
detects and classifies brain tumors.

Figure 5.2: ER Diagram

It describes the relations that may exist between the several entities
in the process, such as patients, doctors, scans, and diagnostic results.
1. Patient Attributes: Patient ID, Name, Age, Gender, Contact
Details, Medical History.

32
Medical History of the patient is noted and scanned times, and he is
diagnosed accordingly.
2. Doctor : Attributes : Doctor ID, Name, Speciality, Experience,
Contact Details. Diagnose for the scan is selected that comes with
reading of multiple reports prepared by various doctors.
3. Scan : Attributes: Scan ID, Scan Resolution, Scan Date, Image
File. Every Scan is provided to the patient and results in association
with classification and detection of tumor.
4. Detection of Tumor : Attributes: Detection ID, Tumor Present,
Tumor Size, Tumor Location, Confidence Score. This is also kept
there to state if a tumor exists or not and then combined with other
information like its size, location, and even the extent to which one
believes the system will be accurate in detecting.
5. Tumor Classification: Attributes: Classification ID, Tumor Type, Risk Level, Classification Accuracy. Once a tumor is detected, the system classifies it as benign or malignant, assigns a risk level, and reports the accuracy of the classification.
6. Report: Attributes: Report ID, Generation Date, Diagnosis, Recommended Treatment. The report is produced from the scan results and the tumor classification, giving a diagnosis along with appropriate treatment recommendations where needed.
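Purely as an illustration of this data model, the following Python sketch maps four of the entities to dataclasses, with relationships expressed as ID references; the Doctor and Report entities would follow the same pattern. Field names mirror the attributes above, but the mapping itself is hypothetical, not the project's schema.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Patient:
    patient_id: int
    name: str
    age: int
    gender: str
    contact_details: str
    medical_history: str = ""

@dataclass
class Scan:
    scan_id: int
    patient_id: int                        # each scan belongs to one patient
    scan_resolution: str
    scan_date: str
    image_file: str

@dataclass
class TumorDetection:
    detection_id: int
    scan_id: int                           # detection is tied to one scan
    tumor_present: bool
    tumor_size: Optional[float] = None     # e.g., in millimetres
    tumor_location: Optional[str] = None
    confidence_score: float = 0.0

@dataclass
class TumorClassification:
    classification_id: int
    detection_id: int
    tumor_type: str                        # "benign" or "malignant"
    risk_level: str
    classification_accuracy: float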

5.3 Activity Diagram

Figure 5.3 shows how a multi-step machine-learning pipeline performs the detection and classification of brain tumors from an MRI image. Its workflow has the following significant steps:
1. Brain MRI Image Acquisition: MRI scans of the brain are acquired first and provided as input to the system.
2. Pre-processing: The MRI images are preprocessed to enhance their quality by removing noise, normalizing the data, and enhancing contrast, giving a representation better suited to analysis.

Figure 5.3: Activity Diagram

3. Segmentation: In this phase, the brain image is segmented so that regions containing probable tumours are separated from other tissue. Segmentation defines the regions of interest within which tumors may lie.
4. Feature Extraction: Relevant characteristics such as texture, shape, and size are extracted from the segmented regions of interest to describe normal and abnormal tissue.
5. Feature Reduction: Redundant and irrelevant features are eliminated, reducing the feature set to only those most significant for accurate classification.
6. Training and Classification, Phase 1: The resulting features are fed into a pre-trained machine-learning model that classifies the image as "Normal" or "Abnormal". If the image is classified as "Normal", no further processing is required; if "Abnormal", it is passed on to the next phase.
7. Classification, Phase 2: Images classified as abnormal are subclassified as benign (non-cancerous) or malignant (cancerous). The resulting labels are then used to guide treatment plans. A sketch of steps 5-7 follows this list.
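The following is a minimal Python sketch of steps 5-7, assuming scikit-learn and using random placeholder data in place of real extracted features: PCA performs the feature reduction, and two SVMs form the two-phase cascade. It mirrors the workflow in the diagram but is not the project's implementation.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))         # placeholder feature vectors
y_stage1 = rng.integers(0, 2, 300)     # 0 = normal, 1 = abnormal
y_stage2 = rng.integers(0, 2, 300)     # 0 = benign, 1 = malignant

pca = PCA(n_components=5).fit(X)       # step 5: feature reduction
X_red = pca.transform(X)

stage1 = SVC().fit(X_red, y_stage1)    # step 6: normal vs. abnormal
abnormal = y_stage1 == 1               # step 7 trains only on abnormal cases
stage2 = SVC().fit(X_red[abnormal], y_stage2[abnormal])

def diagnose(x: np.ndarray) -> str:
    """Run one feature vector through the two-phase cascade."""
    z = pca.transform(x.reshape(1, -1))
    if stage1.predict(z)[0] == 0:
        return "Normal"
    return "Malignant" if stage2.predict(z)[0] == 1 else "Benign"

print(diagnose(X[0]))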

5.4 Sequence Diagram

Figure 5.4 shows the sequence flow for detecting a brain tumour using both image-processing and machine-learning techniques. The process consists of several phases, from the original image up to the final result, elaborated step by step below.

Figure 5.4: Sequence Diagram

1. Original Image: The process starts with a brain MRI image, which forms the raw data fed into the subsequent processing.
2. Thresholding: In this first pre-processing step, median filtering is applied to the image to remove noise. Noise removal smooths the image and improves its quality without losing essential features such as edges. After filtering, the regions of interest are extracted from the image; such regions are potential tumors and therefore become the focus of further processing.
3. Segmentation: K-Means clustering is applied to the image, grouping pixels into clusters based on their similarity so as to separate the tumor area from normal brain tissue. This segmentation step isolates the tumor region for further analysis (see the sketch after this list).
4. SVM Classifier: The features of the segmented image are then passed to the SVM classifier. SVM is a binary classification algorithm that applies a data-driven decision rule to classify the tumor as benign or malignant.
5. Result: The output is generated from the SVM's prediction, labelling the tumor as benign or malignant. This result is then delivered to the user to support diagnosis and treatment planning.
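For illustration, here is a minimal Python sketch of the filtering and K-Means segmentation steps, assuming SciPy and scikit-learn; the synthetic image with one bright square stands in for a real MRI slice.

import numpy as np
from scipy.ndimage import median_filter
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
image = rng.normal(100, 10, (64, 64))
image[20:30, 20:30] += 80                  # synthetic bright "tumor" patch

smoothed = median_filter(image, size=3)    # step 2: denoise, preserve edges

# Step 3: cluster pixel intensities into two groups (tissue vs. candidate tumor).
pixels = smoothed.reshape(-1, 1)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)
mask = labels.reshape(image.shape)

# Take the brighter of the two clusters as the candidate tumor region.
brighter = int(smoothed[mask == 1].mean() > smoothed[mask == 0].mean())
tumor_mask = mask == brighter
print("candidate tumor pixels:", int(tumor_mask.sum()))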

Appendix A

https://www.kaggle.com/datasets/adarshsingh0903/legal-dataset-sc-judgments-india-19502024/data

Figure 5.5: Screenshot Of Database

Appendix B


Figure 5.6: Research paper

Appendix C

SDG 3: Good Health and Well-Being
Target 3.4: By 2030, reduce by one third premature mortality from noncommunicable diseases (NCDs) through prevention, treatment, and the promotion of mental health and well-being.
SDG 9: Industry, Innovation, and Infrastructure
Target 9.5: Promote scientific research, improve technological capabilities, and increase innovation in all countries, especially developing ones.
SDG 10: Reduced Inequalities
Target 10.3: Ensure equal opportunities and reduce inequalities of outcome by eliminating discriminatory practices and promoting appropriate legislation, policies, and actions.
SDG 17: Partnerships for the Goals
Target 17.6: Enhance international cooperation in science, technology, and innovation to share knowledge and strengthen global partnerships.


Figure 5.7: SDG

Appendix D

Figure 5.8 shows the plagiarism report of our project, which is 8 percent.

Figure 5.8: Plagiarism Report

Figure 5.9 shows the AI report of our project, which is less than 20 percent.


Figure 5.9: AI Report

