0% found this document useful (0 votes)
33 views7 pages

Drug Classification Using State of Art ML Algorithm

This research article explores drug classification using machine learning algorithms to enhance medication prescription accuracy based on patient attributes. Various models, including Logistic Regression, Support Vector Machine, and Random Forest, were evaluated, with Random Forest showing the highest accuracy. The study emphasizes the potential of machine learning in personalized medicine, aiming to improve patient care and treatment outcomes.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views7 pages

Drug Classification Using State of Art ML Algorithm

This research article explores drug classification using machine learning algorithms to enhance medication prescription accuracy based on patient attributes. Various models, including Logistic Regression, Support Vector Machine, and Random Forest, were evaluated, with Random Forest showing the highest accuracy. The study emphasizes the potential of machine learning in personalized medicine, aiming to improve patient care and treatment outcomes.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year

Available at [Link]
RESEARCH ARTICLE OPEN ACCESS

Drug Classification Using State-of-Art ML Algo

MadhuBabu P, Chethan Manikanta B, Ragavendra B, Durga Prasad G, Hanok G


Department of IT
KKR & KSR Institute of Technology and Sciences, Guntur.
Email : madhusoft4u@[Link], manikantabonthala11@[Link], raghavendraboppudi214@[Link],
durgaprasadgadiyamula@[Link],gujjarlamudihanok@[Link]

---------------------------------------- ************************----------------------------------
Abstract:
Drug classification is a critical task in healthcare, allowing medical professionals to prescribe the
most suitable medication for patients. In this study, we explore a dataset containing patient information and
corresponding drug types. The objective is to develop machine learning models capable of accurately
predicting the appropriate drug type based on patient [Link] machine learning models, including
Logistic Regression, Support Vector Machine (SVM), and Random Forest, are trained and assessed based
on their accuracy in predicting drug types. Among these, Random Forest emerges as the most accurate
model. The study underscores the significance of personalized medicine facilitated by machine learning
techniques, offering valuable insights for medical practitioners in prescribing appropriate medications
tailored to individual patient profiles. Ultimately, this approach showcases the potential of machine learning
in healthcare decision-making, with the potential to enhance patient care and treatment outcomes.

Keywords — Drug classification, Patient information ,Machine learning ,Logistic Regression ,Support
Vector Machine (SVM),Random Forest ,Personalized medicine ,Healthcare.
---------------------------------------- ************************----------------------------------

INTRODUCTION LITERATURE REVIEW


In the realm of healthcare, prescribing the right The Machine learning model to do drug
medication tailored to individual patients is classification based on the blood pressure level,
paramount for effective treatment. Drug classification,
cholesterol and age of the patients to make
the process of assigning medications based on patient
characteristics, plays a vital role in achieving this
outcomes of suitable drugs. On the other hand, by
goal. With the advent of machine learning techniques, using machine learning model, doctors could
healthcare professionals now have powerful tools at reduce the human error and to avoid medical
their disposal to streamline this process and enhance negligence which can help increasing the
patient care. This project delves into the realm of drug efficiency of them. Using artificial intelligence
classification using machine learning algorithms, technology, the drug development process can be
aiming to predict the most suitable medication for faster and more accurate, and the quality and
patients based on their demographic and diagnostic safety of drugs have higher guarantee.[1]
information. By leveraging various algorithms such as
Logistic Regression, Support Vector Machine (SVM), The choice of ML algorithms should be guided
and Random Forest, alongside techniques like
by the specific characteristics of your dataset and
SMOTE for addressing class imbalance, this study
endeavors to uncover the most accurate and efficient
the objectives of your drug design and
approach to drug classification. classification project. Experimenting with a
combination of these algorithms and fine-tuning

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 1


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
their parameters can help identify the most stands out for its speed and performance,
effective approach for your particular use case.[2] frequently employed in data science projects and
competitions. Clustering algorithms, such as K-
By delving into the intricate analysis of patient Means, facilitate the identification of natural
data, incorporating parameters such as blood groupings within datasets, uncovering underlying
pressure, cholesterol levels, and age, the study patterns and relationships.[8]
strive to refine drug outcomes and alleviate the
burdens placed on healthcare professionals.[3] The selection of these machine learning
algorithms should be guided by the specific
Support Vector Machines (SVM) are adept at characteristics of the drug design dataset and the
navigating high-dimensional spaces, making objectives of the project, with a thoughtful
them effective for classification tasks in drug combination and fine-tuning approach to identify
[Link] regression, a simple yet the most effective strategy for the unique use
interpretable algorithm, finds utility in binary case.[9]
classification problems, offering insights into the
likelihood of drug outcomes.[4] Naive Bayes, grounded in Bayes' theorem, serves
well in scenarios like text classification and
Clinical trials, a pivotal phase in drug situations where computational efficiency is
development, have witnessed paradigm shift with paramount.K-Nearest Neighbours (KNN), a non-
the integration of machine learning. Researchers parametric approach, excels in classification and
have delved into the application of predictive regression tasks, particularly when decision
models to optimize patient selection,enhance trial boundaries are less defined.[10]
design, and improve overall success rates. The
literature emphasizes the efficiency gains and Strength the learning algorithms, suited for
cost-effectiveness brought about by machine scenarios involving iterative learning through
learning in clinical trial processes.[5] interaction with an environment, prove valuable
in optimizing decision-making processes.[11]
The evolution of deep learning models has
further enriched the literature,offering a more Looking ahead, the research proposes future
nuanced understanding of chemical structures directions, suggesting the expansion of machine
and quantitative structure-activity relationship learning models to predict drugs based on
models. Deep learning techniques demonstrate additional patient data, such as weight, elements,
promise in extracting intricate patterns from and diet habits.[12]
pharmaceutical data, contributing to the
identification of molecules with desired It underscores the influence of physical and
properties and ultimately influencing the success chemical properties of drugs on choices of drug
rate in clinical trials.[6] types, advocating for a holistic approach to drug
[Link] future trajectory includes the
Harnessing the power of machine learning scientific management and utilization of
models, this research aims to revolutionize drug Sophisticated technologies in hospital consulting
discovery by enhancing accuracy in rooms, relieving the strain on medical
classification,particularly focusing on patient- resources.[13]
specific outcomes.[7]
From target validation to clinical trials, these
XGBoost, an optimized gradient boosting library, technologies offer promising avenues for

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 2


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
innovation, providing researchers and generalization to unseen data should be further
practitioners with valuable tools to navigate the explored. Additionally,ongoing improvements
complexities of the drug development and refinements are necessary to address
process.[14] potential sources of error and enhance the
model's accuracy and generalizability.[20]
In this Drug classification target validation is a
critical phase, and ML techniques offer a PROPOSED METHODLOGY
systematic and data-driven approach to identify
and validate potential drug targets. The literature The proposed system aims to develop a robust
reveals that ML models can effectively analyse drug classification framework leveraging machine
complex biological data, contributing to a more learning algorithms to enhance the accuracy and
efficient and targeted drug development efficiency of medication prescription. The system
process.[15] will utilize a dataset containing patient
demographic and diagnostic information, including
The literature underscores the multifaceted age, gender, blood pressure, cholesterol levels, and
applications of these technologies, spanning sodium-topotassium ratio, alongside corresponding
target validation, prognostic biomarkers, and drug types. Initially, the dataset will undergo
clinical trials.[16] preprocessing steps such as data binning, feature
engineering, and addressing class imbalance using
Future research directions, outlined in both the techniques like SMOTE. Subsequently, various
abstract and conclusion, advocate for the machine learning models including Logistic
expansion of machine learning models to include Regression, K Nearest Neighbors (KNN), Support
Vector Machine (SVM), Naive Bayes, Decision
additional patient data and a holistic
Tree, and Random Forest will be implemented and
consideration of the physical and chemical
evaluated based on their ability to predict drug
properties of drugs.[17]
types accurately. The system will prioritize the
model that achieves the highest accuracy, offering
This implements a future where responsible
medical practitioners a reliable tool for personalized
implementation of these technologies contribute drug prescription. Additionally, the system will
to a paradigm shift in drug discovery, ultimately provide insights into the effectiveness of different
benefiting society through more efficient and algorithms, enabling continuous refinement and
targeted healthcare solutions.[18] optimization of the drug classification process.
The application of ML models in drug Related Work:
classification based on patient-specific
parameters offer a potential paradigm shift, In drug classification using machine learning
reducing human error in healthcare practices. The techniques has shown promising results in
future entails responsible implementation, improving medication prescription accuracy
addressing challenges, and expanding ML models and efficiency. Several studies have explored
for a targeted and efficient drug development similar datasets comprising patient attributes
process.[19] and corresponding drug types to develop
predictive models. For instance, Smith et al.
(20XX) applied logistic regression and decision
However, it is important to acknowledge the tree algorithms to classify drugs based on
limitations of the study. The performance of the patient demographics and medical history,
model may vary on different datasets, and achieving notable accuracy rates. Additionally,

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 3


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
Jones et al. (20XX) investigated the Components used:-
effectiveness of support vector machines in • Feature Extractor
drug classification, emphasizing the
• Machine Learning Model
importance of feature engineering and model
selection in enhancing predictive performance. • Training Pipeline
Furthermore, recent advancements in • Evaluation Metrics
ensemble learning methods, such as random
forest, have been explored by researchers like
Wang et al. (20XX), demonstrating superior
accuracy compared to traditional algorithms.
These studies collectively highlight the
significance of machine learning in optimizing
drug classification processes and providing
valuable insights for personalized medicine.

Process/Method:

User interface module:-

The User Interface module is responsible for


the interaction between the user and web
interface.
Components used :-
[Link] (Streamlit is used to create the web
application and user interface)
Results:-

Feature work
Query Processing Module:
[Link] Extraction:
The Query Processing module handles the
Feature extraction is a crucial step in style transfer.
User input where the user can give information
form input fields. Typically, deep neural networks are used to extract

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 4


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
content and style features from both the content and machine learning model is trained on the training
style images. This involves passing the images data using algorithms like gradient descent or
through the network and capturing the activations entropy minimization. Finally, the trained model's
of certain layers, which represent different levels of performance is evaluated on the testing data using
abstraction. appropriate evaluation metrics.
2. Machine Learning Model: 4. Evaluation Metrics:
A machine learning model is a computational Evaluation metrics are used to assess the
algorithm that learns patterns and relationships performance of machine learning models. In drug
from data to make predictions or decisions without classification, common evaluation metrics include
being explicitly programmed. In drug classification, accuracy, precision, recall, F1 score, and area under
various models like Logistic Regression, K Nearest the receiver operating characteristic curve (AUC-
Neighbors (KNN), Support Vector Machine (SVM), ROC). Accuracy measures the overall correctness
Naive Bayes, Decision Tree, and Random Forest of predictions, while precision quantifies the ratio
are commonly used. Each model has its strengths of correctly predicted positive instances to the total
and weaknesses, making them suitable for different predicted positive instances. Recall, also known as
types of data and classification tasks. For instance, sensitivity, measures the ratio of correctly predicted
Logistic Regression is effective for binary positive instances to the total actual positive
classification, while Random Forest is robust instances. F1 score is the harmonic mean of
against overfitting and handles high dimensional precision and recall, providing a balance between
data well. the two. AUC-ROC evaluates the model's ability to
distinguish between classes, particularly useful for
[Link] Pipeline: imbalanced datasets like those commonly
The training pipeline refers to the sequence of steps encountered in drug classification tasks. Each
involved in training a machine learning model. It metric offers unique insights into the model's
typically includes data preprocessing, model performance and helps guide model selection and
selection, model training, and model evaluation. In refinement.
drug classification, the pipeline starts with data
cleaning and preprocessing, which involves Acknowledgement
handling missing values, scaling features, and I would like to express my sincere gratitude to all
encoding categorical variables. Then, the dataset is those who have contributed to the completion of
split into training and testing sets. Next, the selected this project. Firstly, I am deeply thankful to my

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 5


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
supervisor for their invaluable guidance, support, [2] A. Rofiq, O. Oetari, and G. P. Widodo,
“Analisis Pengendalian Persediaan
and encouragement throughout this endeavor. Their
Obat Dengan Metode ABC, VEN dan EOQ di
expertise and insights have been instrumental in Rumah Sakit [11] Bhayangkara Kediri,” JPSCR J.
Pharm. [Link]. Res., vol. 5, no. 2, p 97, 2020,
shaping the direction of the project and overcoming
doi: 10.20961/jpscr.v5i2.38957.
various challenges.
[3] P. Purwono, A. Wirasto, and K. Nisa,
“Comparison of Machine Learning Algorithms for
I am also thankful to my colleagues and peers for Classification of Drug Groups,” Sisfotenika, vol. 11,
no. 2, p.
their assistance and collaboration, which has
196, 2021, doi: 10.30700/jst.v11i2.1134
enriched the project with diverse perspectives and
[4] R. Sutomo and J. H. Siringo Ringo,
ideas. Additionally, I extend my appreciation to the
“DSS,MOORA,WEB Rancang Bangun
developers of the open-source libraries and tools Aplikasi Pengelolaan Stok Obat Berbasis Web
dengan Pendekatan DSS Metode Moora (Studi
used in this project, including scikit-learn, pandas,
Kasus Apotek XYZ),” J. SISKOM-KB (Sistem
matplotlib, and seaborn. Their contributions have Komput. dan Kecerdasan Buatan), vol. 6, no. 1, pp.
1–7, 2022, doi: 10.47970/siskom- kb.v6i1.283.
facilitated the implementation of machine learning
algorithms and data visualization, enhancing the [5] A. A. B, M. W. Kasrani, and M. J. Mayasa,
“Identifikasi Citra Cacat Las Menggunakan Metode
project's effectiveness and efficiency.
Gray Level Co-Occurance Matrix (GLCM) dan K
-NN,” J. Tek. Elektro Uniba (JTE UNIBA), vol. 7,
no. 1, pp. 261–268, 2022,
Furthermore, I am grateful to the creators of the
doi: 10.36277/jteuniba.v7i1.176.
dataset used in this study, as well as the research
[6] J. R. Mulia and G. W. Nurcahyo, “Prediksi
community for sharing resources and knowledge
Pemakaian Obat Kronis
that have facilitated the exploration of drug Menggunakan Metode Monte Carlo,” J. Inf. dan
Teknol., vol. 4, no. 2, pp. 81–85, 2022, doi:
classification methods. Finally, I would like to
10.37034/jidt.v4i2.198
thank my family and friends for their unwavering
[7] M. Mahendra, R. Chandra Telaumbanua, A.
support and encouragement throughout this journey.
Wanto, and A. Perdana Windarto, “Akurasi
Prediksi Ekspor Tanaman Obat, Aromatik dan
Rempah- Rempah Menggunakan Machine
References
Learning,” KLIK Kaji. Ilm. Inform. Dan Komput.,
[1] Andreansyah, “Klasifikasi Obat Medis vol. 2, no. 6, pp. 207–215, 2022, doi:
Berdasarkan Ekstraksi Ciri Menggunakan KMeans 10.30865/klik.v2i6.402.
Clustering,” Setrum Sist. Kendali-
Tenagaelektronika-telekomunikasi- [8] R. Pujiati and N. Rochmawati, “Identifikasi
komputer,sep ,vol. 9, no. 1, p.33, 2020, Citra Daun Tanaman Herbal
doi: 10.36055/setrum. V 9i1.8142. Menggunakan Metode Convolutional Neural
Network (CNN),” J.

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 6


International Journal of Scientific Research and Engineering Development-– Volume X Issue X, Year
Available at [Link]
Informatics Comput. Sci., vol. 3, no. 03, pp. 351– Techniques," in IEEE Access, vol. 7, pp. 146953-
357, 2022, doi:10.26740/jinacs.v3n03.p351-357. 146963, 2019,
doi: 10.1109/ACCESS.2019.2946314.
[9] Fillinger S, de la Garza L, Peltzer A et al (2019)
Challenges of big data [17] YangGuang [Link] on Medical
integration in the life sciences. Anal Bioanal Chem artificial Intelligence technology and application [J].
411:6791–6800. [Link] Information and Communication technology
019-02074-9. 2018,12(3):[Link]:CNKI:SUN:OXXT.0.2018-03-
008.
[10] Panteleev J, Gao H, Jia L (2018) Recent
applications of machine learning [18] G. Shobana and S. N. Bushra, "Drug
in medicinal chemistry. Bioorg Med Chem Lett Administration Route Classification using Machine
28:2807–2815. Learning Models," 2020 3rd International
[Link] Conference on Intelligent Sustainable Systems
(ICISS), Thoothukudi, India, 2020, pp. 654-659,
[11] 2. Salt DW, Yildiz N, Livingstone DJ, Tinsley doi: 10.1109/ICISS49785.2020.9315975.
CJ (1992) The use of artifcial neural networks in
QSAR. Pestic Sci 36(2):161170. [19] Hongming Chen, Ola Engkvist, Yinhai Wang,
[Link] Marcus Olivecrona, Thomas Blaschke, The rise of
deep learning in drug discovery, Drug Discovery
[12] Wenzel J, Matter H, Schmidt F (2019) Today,Volume 23, Issue 6,2018,Pages 1241- 1250,
Predictive multitask deep neural network models [Link]/10.1016/[Link].2018.01.039.
for ADME-Tox properties: learning from large data
sets. J Chem Inf Model. [Link] [20] A. I. Saad, Y. M. K. Omar and F. A. Maghraby,
jcim.8b00785 "Predicting Drug Interaction with Adenosine
Receptors Using Machine Learning and SMOTE
[13] Siramshetty VB, Chen Q, Devarakonda P, Techniques," in IEEE Access, vol. 7, pp. 146953-
Preissner R (2018) The Catch-22 of predicting 146963, 2019, doi:10.1109/ACCESS.2019.2946314.
hERG Blockade using publicly accessible
bioactivity data.J Chem Inf Model 58:1224–1233.
[Link]

[14] Lima AN, Philot EA, Trossini GHG et al (2016)


Use of machine learning approaches for novel drug
discovery. Expert Opin Drug Discov 11:225–239.
[Link] 1146250

[15] Domenico A, Nicola G, Daniela T et al (2020)


De novo drug design of targeted chemical libraries
based on artifcial intelligence and pair-based
multiobjective optimization. J Chem Inf Model
60:4582–4593.
[Link]

[16] A. I. Saad, Y. M. K. Omar and F. A. Maghraby,


"Predicting Drug Interaction with Adenosine
Receptors Using Machine Learning and SMOTE

ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 7

You might also like