Chapter 5

Chapter 5 outlines the experimental setup for a machine learning project using Python and its libraries, particularly focusing on data science and analysis. It describes the dataset structure, feature extraction methods, and the implementation of various filters for model evaluation. The chapter also includes visual representations of data distributions and analyses to understand the impact of different variables on medical expenses.

Uploaded by

pankaj kumar singh

CHAPTER 5

EXPERIMENT SETUP

5.1 Experimental Framework

Python is a prominent environment used by researchers for developing and deploying the
systems they build. It has a vast set of libraries, with many modules and packages that
help programmers complete their work efficiently.

Figure 5.1: GUI Anaconda

Anaconda is a free environment whose source is open to all. Python and its libraries are
used very efficiently in data science and data analysis. They are also widely used for
building scalable machine learning algorithms. Python can apply various machine learning
techniques such as Classification, Regression, Recommendation, and Clustering.

Python offers researchers a ready-to-use environment for performing data mining tasks
on large volumes and varieties of data effectively and in less time.

[Figure: Python utility libraries (Pandas, SciKit-Learn, SciPy, Matplotlib)]

Figure 5.2: Libraries of Python

5.2 Dataset & Features

Machine learning data is usually described in a matrix called a dataset. The matrix is
structured so that each row corresponds to an observation (example) and each column
represents a feature (also called a variable or attribute) that describes the data. Data
values can take many representations. Data can be numerical (integers or real numbers) or
nominal, where values are differentiated by name. Nominal data is a type of categorical
data: as its name indicates, it can only take a fixed set of nominal values (or
categories).
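
The row/column structure and the numerical/nominal distinction described above can be sketched with pandas. This is a minimal illustration, not the study's code: the column names are those mentioned later in this chapter, and every value in the frame is hypothetical.

```python
import pandas as pd

# Hypothetical rows for illustration only; the values are not taken from the study's dataset.
df = pd.DataFrame(
    {
        "age": [19, 33, 45],
        "bmi": [27.9, 22.7, 30.1],
        "children": [0, 1, 2],
        "smoker": ["yes", "no", "no"],
        "region": ["southwest", "northwest", "southeast"],
        "expenses": [16884.92, 21984.47, 8240.59],
    }
)

# Rows are observations, columns are features.
n_rows, n_cols = df.shape

# Numerical features hold integers or real numbers.
numerical_cols = df.select_dtypes(include="number").columns.tolist()

# Nominal features take values from a fixed set of named categories.
nominal_cols = [c for c in df.columns if c not in numerical_cols]
categories = {c: sorted(df[c].unique()) for c in nominal_cols}

print(n_rows, n_cols)
print(numerical_cols)
print(categories)
```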

5.3 Implementation
The model employs filters for faster evaluation and lower overall run time. The
pre-processing methods and the filters applied strongly affect the final evaluation
results of the classifiers (ML-based models). Feature extraction, conversion of nominal
attributes to binary, and data cleaning are a few of those filters.
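
Two of the filters named above, cleaning and nominal-to-binary conversion, might look as follows in pandas. The chapter does not say which tool or settings were used, so the choice of `drop_duplicates`, `dropna`, and `pd.get_dummies` is an assumption, and the data is hypothetical.

```python
import pandas as pd

# Hypothetical rows; one duplicate row and one missing value are included on purpose.
raw = pd.DataFrame(
    {
        "age": [19, 33, 33, 45],
        "smoker": ["yes", "no", "no", None],
        "region": ["southwest", "northwest", "northwest", "southeast"],
    }
)

# Cleaning: drop exact duplicate rows and rows with missing values.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

# Nominal-to-binary: each category becomes its own 0/1 indicator column.
encoded = pd.get_dummies(clean, columns=["smoker", "region"], dtype=int)

print(encoded)
```

One indicator column per category is the simplest encoding; for linear models it is common to drop one column per feature (`drop_first=True`) to avoid redundant columns.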

5.4 Different Process Stage

Figure 5.3: Calling Libraries

Explanation: Figure 5.3 shows the libraries we import; they provide all the
functionality required.
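
The exact imports in Figure 5.3 cannot be recovered from the text. A typical set for this kind of analysis, assuming NumPy, pandas, and Matplotlib are installed, might be:

```python
import numpy as np               # numerical arrays
import pandas as pd              # tabular data handling
import matplotlib
matplotlib.use("Agg")            # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt  # plotting

# Quick check that the libraries are importable, reporting their versions.
versions = {
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "matplotlib": matplotlib.__version__,
}
print(versions)
```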

Figure 5.4: Major Columns

Explanation: Figure 5.4 shows the major columns available in our dataset. We have 9
columns in our dataset.

Figure 5.5: Major Columns

Explanation: Figure 5.5 shows all attributes available in our dataset. We have 7
columns in our dataset.
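
Listing and counting columns is usually done with `DataFrame.columns` and `DataFrame.shape`. The sketch below reads a small hypothetical CSV from memory, because the data file's name and full contents are not given in this chapter; it uses only the columns named in the text, so its column count does not reproduce the counts stated above.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the study's data file.
csv_text = """age,bmi,children,smoker,region,expenses
19,27.9,0,yes,southwest,16884.92
33,22.7,1,no,northwest,21984.47
"""

df = pd.read_csv(io.StringIO(csv_text))

column_names = df.columns.tolist()  # names of all attributes
n_columns = df.shape[1]             # number of columns

print(column_names)
print(n_columns)
```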

Description of Different Columns:

Figure 5.6: Descriptive Summary

Explanation: Figure 5.6 shows the descriptive summary of age, bmi, children and expenses.
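
A descriptive summary of this kind is what `DataFrame.describe()` produces for numerical columns. The values below are hypothetical and only show the shape of the output.

```python
import pandas as pd

# Hypothetical values for the four numerical columns named above.
df = pd.DataFrame(
    {
        "age": [19, 33, 45, 60],
        "bmi": [27.9, 22.7, 30.1, 25.8],
        "children": [0, 1, 2, 1],
        "expenses": [16884.92, 21984.47, 8240.59, 28923.14],
    }
)

# count, mean, std, min, quartiles and max for each column
summary = df[["age", "bmi", "children", "expenses"]].describe()
print(summary)
```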

Distribution of age, bmi and expenses:

Figure 5.7: Distribution of smoker, children and region (Pie & Bar Graph)
Explanation: Figure 5.7 shows what the distributions reveal:
1) The number of people is roughly equal across all ages.
2) Most people have a bmi of around 30.
3) Expenses appear to be right-skewed (see skewness in probability and statistics).

Note: Because expenses are skewed, this column can be transformed using either a log
transformation or a square root transformation, which brings its distribution closer to
a normal distribution.
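
The effect of the two transformations can be checked by comparing sample skewness before and after. The sketch below uses a small hypothetical right-skewed series, not the study's data, and measures skewness with `Series.skew()`.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed expenses: most values are small, a few are very large.
expenses = pd.Series(
    [1000, 1200, 1500, 2000, 2500, 3000, 4000, 6000, 12000, 40000], dtype=float
)

log_expenses = np.log(expenses)    # log transformation (requires positive values)
sqrt_expenses = np.sqrt(expenses)  # square root transformation

skew_raw = expenses.skew()
skew_sqrt = sqrt_expenses.skew()
skew_log = log_expenses.skew()

print(f"raw: {skew_raw:.2f}  sqrt: {skew_sqrt:.2f}  log: {skew_log:.2f}")
```

On this sample both transformations lower the skewness, and the log transformation lowers it more. Neither guarantees an exactly normal distribution, so the result is worth checking with a histogram.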

Distribution of age, bmi and expenses:

Figure 5.8: Distribution of smoker, children and region (Area Graph)

Bivariate Analysis:

Figure 5.9: Bivariate Analysis

Bivariate analysis, illustrated in Figure 5.9, is one of the simplest forms of
quantitative analysis. It involves analysing two variables to determine the empirical
relationship between them. It can be helpful in testing simple hypotheses of association.
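
For two numerical variables, one simple bivariate measure is the Pearson correlation coefficient. The chapter does not state which measure was used, so this is an assumed example on hypothetical age and expenses values.

```python
import pandas as pd

# Hypothetical paired observations of two variables.
df = pd.DataFrame(
    {
        "age": [19, 25, 33, 45, 52, 60],
        "expenses": [1800.0, 2700.0, 4500.0, 7900.0, 10200.0, 13100.0],
    }
)

# Series.corr computes the Pearson correlation by default.
r = df["age"].corr(df["expenses"])
print(f"Pearson r between age and expenses: {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear association, and a value near 0 indicates little linear association.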

Impact of smoking and children in Medical Expenses:

Figure 5.10: Impact of smoking in Medical Expenses

Figure 5.10 shows how these columns play vital roles in medical expenses:

The number-of-children column is used as a facet on the row side, whereas the region
column is used as a facet on the column side.
• When age is lower, expenses are also lower.
• The expenses of smokers in all regions range from 20k to 60k.
• The expenses of non-smokers in all regions range from 10k to 20k.
• The lower end of the expense range corresponds to younger people, and the higher end
to older people.
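
Ranges like those listed above can be read from a grouped minimum and maximum. The sketch below groups hypothetical records by smoker and region; the numbers are invented and only chosen to fall inside the ranges stated in the text.

```python
import pandas as pd

# Hypothetical records; not the study's data.
df = pd.DataFrame(
    {
        "smoker": ["yes", "yes", "no", "no", "yes", "no"],
        "region": [
            "southeast", "southeast", "southeast",
            "northwest", "northwest", "northwest",
        ],
        "expenses": [22000.0, 58000.0, 11000.0, 12000.0, 34000.0, 19000.0],
    }
)

# Minimum and maximum expenses for every (smoker, region) combination.
ranges = df.groupby(["smoker", "region"])["expenses"].agg(["min", "max"])
print(ranges)
```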

Bubble Chart to represent the relation of expenses with different parameters

Figure 5.11: Bubble chart Analysis

Figure 5.11 shows how these columns play vital roles in medical expenses:
• This chart makes it clear that BMI is not a strong driver of expenses, as people with
a low BMI also have high medical expenses.
• This chart makes it clear that people who smoke have higher medical expenses.
• The size of each bubble, which represents age, shows that older people fall into the
higher-expense categories.
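
A bubble chart of this kind can be drawn with Matplotlib's `scatter`, mapping bmi to the x-axis, expenses to the y-axis, age to bubble size, and smoker status to colour. The axis mapping, the size scaling factor, and all data values are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical records; not the study's data.
df = pd.DataFrame(
    {
        "age": [19, 28, 37, 46, 55, 64],
        "bmi": [24.0, 33.5, 21.8, 30.2, 26.4, 35.1],
        "smoker": ["no", "yes", "yes", "no", "no", "yes"],
        "expenses": [2100.0, 24000.0, 31000.0, 9000.0, 12500.0, 47000.0],
    }
)

fig, ax = plt.subplots()

# One scatter call per smoker group, so each group gets its own colour and legend entry.
for label, group in df.groupby("smoker"):
    ax.scatter(
        group["bmi"],
        group["expenses"],
        s=group["age"] * 5,  # bubble area grows with age; the factor 5 is arbitrary
        alpha=0.6,
        label=f"smoker: {label}",
    )

ax.set_xlabel("bmi")
ax.set_ylabel("expenses")
ax.legend()
```

Calling `fig.savefig(...)` with a file name of your choice writes the chart to disk.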

Polar Graph Explanation Based Upon Region:

Figure 5.12: Polar Analysis

Figure 5.13: Region Wise Categorization

Figures 5.12 and 5.13 show how region also affects medical expenses. In this analysis
the southeast region has the maximum values.
Note: Three regions have very similar expenses.
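
A region-wise comparison can be sketched by aggregating expenses per region and drawing the result on polar axes. The chapter does not say which aggregate the figures use, so the mean is an assumption here, and the data is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical records, two per region; not the study's data.
df = pd.DataFrame(
    {
        "region": [
            "northeast", "northeast", "northwest", "northwest",
            "southeast", "southeast", "southwest", "southwest",
        ],
        "expenses": [
            13000.0, 13800.0, 12400.0, 12600.0,
            14500.0, 15100.0, 12200.0, 12500.0,
        ],
    }
)

# Mean expenses per region (assumed aggregate).
mean_by_region = df.groupby("region")["expenses"].mean()
top_region = mean_by_region.idxmax()

# One bar per region, spread evenly around the circle.
n = len(mean_by_region)
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.bar(angles, mean_by_region.values, width=2 * np.pi / n * 0.9)
ax.set_xticks(angles)
ax.set_xticklabels(mean_by_region.index)

print(top_region)
```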