Exploratory Data Analysis Assignment1

Uploaded by

sujithreddy765

Exploratory data analysis assignment

Global id: DOKKA1N

Task 1

The mean delivery time (3.821958) is less than the median (4), which suggests that the distribution is skewed to the left.
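Comparing the mean with the median is only a rule of thumb, so a skewness coefficient can serve as a more direct check. The sketch below computes a moment-based skewness in base R. The vector `delivery_time` holds made-up values for illustration, not the assignment data.

```r
# Hypothetical delivery times (illustrative values, not the assignment data)
delivery_time <- c(0, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 7)

m   <- mean(delivery_time)
med <- median(delivery_time)

# Moment-based skewness: third central moment divided by the second
# central moment raised to the power 1.5.
# A negative value indicates a longer left tail, a positive value a longer right tail.
central  <- delivery_time - m
skewness <- mean(central^3) / mean(central^2)^1.5

cat("mean:", m, " median:", med, " skewness:", skewness, "\n")
```

On the real data the same lines could be applied to Sales_orders$`Delivery time`, and the sign of the result compared with the impression given by the histogram.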

The histogram shows that delivery time is skewed to the left, and the box plot shows a few outliers in the data.
From the box plot we can also see that Q1 is 2, Q3 is 5, and the maximum delivery time is 25.
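The quartiles and outliers read off a box plot can also be computed directly. The sketch below applies the usual 1.5 * IQR rule in base R to a made-up vector (the values are illustrative assumptions, not the assignment data).

```r
# Hypothetical delivery times (illustrative only)
delivery_time <- c(1, 2, 2, 3, 4, 4, 5, 5, 6, 25)

# First and third quartiles and the interquartile range
q   <- quantile(delivery_time, probs = c(0.25, 0.75), names = FALSE)
iqr <- IQR(delivery_time)

# Fences of the 1.5 * IQR rule; points outside them are flagged as outliers
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- delivery_time[delivery_time < lower | delivery_time > upper]
print(outliers)

# boxplot.stats() returns the points that boxplot() would draw as outliers.
# It uses Tukey's hinges, so its fences can differ slightly from quantile().
print(boxplot.stats(delivery_time)$out)
```

The two approaches may disagree on borderline points because hinges and quantiles are defined slightly differently.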
Task 2
Outliers are visible for several sub-categories: Accessories, Art, Binders, Machines, Paper, Phones, and Tables. Fasteners have the highest median quantity; the medians of the remaining sub-categories differ little.
The interquartile ranges of the sub-categories are also similar, except for Machines and Labels.
For Storage and Supplies, the first quartile and the median are equal.
The first-quartile value is the same for all sub-categories except Fasteners and Labels.
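The per-sub-category quartiles described above can be tabulated rather than read off the plot. This is a minimal sketch in base R; the data frame `orders` and its column names are hypothetical stand-ins for the assignment data.

```r
# Hypothetical order lines (illustrative only, not the assignment data)
orders <- data.frame(
  sub_category = c("Fasteners", "Fasteners", "Fasteners",
                   "Labels", "Labels", "Labels",
                   "Paper", "Paper", "Paper", "Paper"),
  quantity = c(4, 5, 6, 2, 3, 7, 2, 3, 3, 14)
)

# Split quantity by sub-category, then compute Q1, median, Q3 and IQR per group
summary_by_group <- t(sapply(
  split(orders$quantity, orders$sub_category),
  function(q) {
    c(q1     = quantile(q, 0.25, names = FALSE),
      median = median(q),
      q3     = quantile(q, 0.75, names = FALSE),
      iqr    = IQR(q))
  }
))

# One row per sub-category, one column per statistic
print(summary_by_group)
```

A table like this makes it easier to confirm which sub-categories share the same first quartile or have an unusually wide interquartile range.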

Task 3
Machines have an outlier, an order incurring a loss of 6500, and Machines also have a lower median profit than the remaining sub-categories. Copiers have a higher median profit than the remaining sub-categories.
Most sub-categories have median profits close to zero or slightly above.
Several sub-categories show both positive and negative profit outliers, suggesting that some sales result in losses.
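As a numerical companion to the profit box plot, the sketch below computes the median profit and the share of loss-making orders per sub-category, and picks out the single largest loss. The data frame `sales` and its values are hypothetical and only illustrate the calculation.

```r
# Hypothetical sales records (illustrative only, not the assignment data)
sales <- data.frame(
  sub_category = c("Machines", "Machines", "Machines",
                   "Copiers", "Copiers", "Copiers",
                   "Paper", "Paper", "Paper"),
  profit = c(-650, -20, 15, 300, 450, 120, 4, 6, -1)
)

# Median profit per sub-category
median_profit <- tapply(sales$profit, sales$sub_category, median)

# Share of orders with negative profit per sub-category
loss_share <- tapply(sales$profit < 0, sales$sub_category, mean)

# The single order with the largest loss
worst_order <- sales[which.min(sales$profit), ]

print(median_profit)
print(loss_share)
print(worst_order)
```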
Task 4: R code
# Install the packages once if they are not already available
# install.packages("DescTools")
# install.packages("ggplot2")
# Load necessary libraries
library(DescTools)
library(ggplot2)
library(readxl)
Sales_orders <- read_excel("D:/fall_2024/narendar_581/Sales_orders.xlsx")
View(Sales_orders)
# ---- Task 1 ----
# Mean, median and standard deviation of Delivery time
mean(Sales_orders$`Delivery time`)
median(Sales_orders$`Delivery time`)
sd(Sales_orders$`Delivery time`)
# Alternative way to find Mean and Median for Delivery Time
summary(Sales_orders$`Delivery time`)
# Histogram for Delivery Time
hist(Sales_orders$`Delivery time`,
     main = "Histogram of Delivery time",
     xlab = "Delivery time",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)
# Boxplot for Delivery Time
boxplot(Sales_orders$`Delivery time`,
        main = "Boxplot of Delivery time",
        xlab = "Delivery time",
        ylab = "Values")
# ---- Task 2 ----
# Box plot of Quantity by Sub-Category
# Columns are referenced by name inside aes(); backticks handle the hyphen
ggplot(Sales_orders, aes(x = `Sub-Category`, y = Quantity)) +
  geom_boxplot() +
  labs(title = "Box Plot of Quantity by Sub-Category",
       x = "Sub-Category",
       y = "Quantity")
# ---- Task 3 ----
# Mean, median and standard deviation of Profit
mean(Sales_orders$Profit)
median(Sales_orders$Profit)
sd(Sales_orders$Profit)
# Alternative way to find the mean and median of Profit
summary(Sales_orders$Profit)
# Box plot of Profit by Sub-Category
ggplot(Sales_orders, aes(x = `Sub-Category`, y = Profit)) +
  geom_boxplot() +
  labs(title = "Box Plot of Profit by Sub-Category",
       x = "Sub-Category",
       y = "Profit")
