A Five-Step Guide for Conducting Exploratory Data Analysis

by Cody Mazza-Anthony, Data Science & Engineering
Apr 28, 2021 - 16 minute read
Have you ever been handed a dataset and then been asked to describe it? When I
was first starting out in data science, this question confused me. My first thought
was "What do you mean?" followed by "Can you be more specific?" The reality is
that exploratory data analysis (EDA) is a critical tool in every data scientist's kit,
and the results are invaluable for answering important business questions.

Simply put, an EDA refers to performing visualizations and identifying significant
patterns, such as correlated features, missing data, and outliers. EDAs are also
essential for forming hypotheses about why these patterns occur. An EDA most likely
won't appear in your data product, data highlight, or dashboard, but it will help to
inform all of these things.
Below, I'll walk you through key tips for performing an effective EDA. For the more
seasoned data scientists, the contents of this post may not come as a surprise
(rather a good reminder!), but for the new data scientists, I'll provide a potential
framework that can help you get started on your journey. We'll use a synthetic
dataset for illustrative purposes. The data below does not reflect actual data from
Shopify merchants. As we go step-by-step, I encourage you to take notes as you
progress through each section!
Before You Start
Before you start exploring the data, you should try to understand the data at a
high level. Speak to leadership and product to try to gain as much context as
possible to help inform where to focus your efforts. Are you interested in
performing a prediction task? Is the task purely for exploratory purposes?
Depending on the intended outcome, you might point out very different things in
your EDA.
With that context, it's now time to look at your dataset. It’s important to identify
how many samples (rows) and how many features (columns) are in your dataset.
The size of your data helps inform any computational bottlenecks that may occur
down the road. For instance, computing a correlation matrix on large datasets can
take quite a bit of time. If your dataset is too big to work within a Jupyter
notebook, I suggest subsampling so you have something that represents your
data, but isn’t too big to work with.
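If you do decide to subsample, a seeded random sample is usually enough to preserve the overall shape of the data. Here's a minimal sketch using pandas; the file name and the 10 percent fraction are placeholders, not values from this post:

```python
import pandas as pd

# Hypothetical file name; swap in your own data source.
df = pd.read_csv("merchant_snapshots.csv")

# If the full dataset is too large for a notebook, keep a seeded 10% random sample.
if len(df) > 1_000_000:
    df = df.sample(frac=0.10, random_state=42)
```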
The first 5 rows of our synthetic dataset. The dataset above does not reflect actual data from Shopify
merchants.

Once you have your data in a suitable working environment, it's usually a good idea
to look at the first couple rows. The above image shows an example dataset we
can use for our EDA. This dataset is used to analyze merchant behaviour. Here are
a few details about the features:
* Shop Cohort: the month and year a merchant joined Shopify
* GMV (Gross Merchandise Volume): the total value of merchandise sold
* AOV (Average Order Value): the average value of customers' orders since their
first order
* Conversion Rate: the percentage of sessions that resulted in a purchase
* Sessions: the total number of sessions on the merchant's online store
* Fulfilled Orders: the number of orders that have been packaged and shipped
* Delivered Orders: the number of orders that have been received by the
customer
One question to address is “What is the unique identifier of each row in the data?”
A unique identifier can be a column or set of columns that is guaranteed to be
unique across rows in your dataset. This is key for distinguishing rows and
referencing them in our EDA.
Now, if you've been taking notes, here's what they may look like so far:
* The dataset is about merchant behaviour. It consists of historic information
about a set of merchants collected on a daily basis
* The dataset contains 1500 samples and 13 features. This is a reasonable size
and will allow me to work in Jupyter notebooks
* Each row of my data is uniquely identified by the “Snapshot Date” and “Shop
ID” columns. In other words, my data contains one row per shop per day
* There are 100 unique Shop IDs and 15 unique Snapshot Dates
* Snapshot Dates range from ‘2020-01-01’ to ‘2020-01-15’ for a total of 15 days
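For reference, here's a rough sketch of how those notes could be produced with pandas, assuming df is the dataset loaded above and the column names match the synthetic data:

```python
# Samples (rows) and features (columns).
print(df.shape)  # (1500, 13)

# Confirm that ("Snapshot Date", "Shop ID") uniquely identifies each row.
assert not df.duplicated(subset=["Snapshot Date", "Shop ID"]).any()

# Unique shops, unique snapshot dates, and the date range.
print(df["Shop ID"].nunique(), df["Snapshot Date"].nunique())
print(df["Snapshot Date"].min(), df["Snapshot Date"].max())
```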
1. Check For Missing Data

Now that we've decided how we're going to work with the data, we begin to look at
the data itself. Checking your data for missing values is usually a good place to
start. For this analysis, and future analysis, I suggest analyzing features one at a
time and ranking them with respect to your specific analysis. For example, if we
look at the below missing values analysis, we'd simply count the number of
missing values for each feature, and then rank the features by largest amount of
missing values to smallest. This is especially useful if there are a large number of
features.
Feature ranking by missing value counts
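Here's a sketch of how that count-and-rank step might look with pandas (again assuming df is the dataset from above):

```python
# Count missing values per feature and rank from most to least missing.
missing = df.isna().sum().sort_values(ascending=False)
missing_pct = (missing / len(df) * 100).round(1)

print(missing.to_frame("missing_count").assign(missing_pct=missing_pct))
```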
Let’s look at an example of something that might occur in your data. Suppose a
feature has 70 percent of its values missing. As a result of such a high amount of
missing data, some may suggest removing this feature entirely from the data.
However, before we do anything, we try to understand what this feature
represents and why we're seeing this behaviour. After further analysis, we may
discover that this feature represents a response to a survey question, and in most
cases it was left blank. A possible hypothesis is that a large proportion of the
population didn't feel comfortable providing an answer. If we simply remove this
feature, we introduce bias into our data. Therefore, this missing data is a feature in
its own right and should be treated as such.
Now for each feature, I suggest trying to understand why the data is missing and
what it can mean. Unfortunately, this isn’t always so simple and an answer might
not exist. That's why an entire area of statistics, Imputation, is devoted to this
problem and offers several solutions. What approach you choose depends entirely
on the type of data. For time series data without seasonality or trend, you can
replace missing values with the mean or median. If the time series does contain a
trend but not seasonality, then you can apply a linear interpolation. If it contains
both, then you should adjust for the seasonality and then apply a linear
interpolation. In the survey example I discussed above, I handle missing values by
creating a new category "Not Answered" for the survey question feature. I won't go
into detail about all the various methods here; however, I suggest reading How to
Handle Missing Data for more details on Imputation.
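To make those options concrete, here's a minimal sketch of each approach with pandas. The "Survey Response" column is hypothetical, and which method suits which feature depends entirely on your own data:

```python
# No trend or seasonality: fill missing values with the median.
df["Conversion Rate"] = df["Conversion Rate"].fillna(df["Conversion Rate"].median())

# Trend but no seasonality: linear interpolation within each shop's daily series.
df["Sessions"] = (
    df.sort_values("Snapshot Date")
      .groupby("Shop ID")["Sessions"]
      .transform(lambda s: s.interpolate(method="linear"))
)

# Survey-style categorical feature: treat missing as its own category.
df["Survey Response"] = df["Survey Response"].fillna("Not Answered")
```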
Great! We've now identified the missing values in our data—let’s update our
summary notes:
* 10 features contain missing values
* “Fulfilled Orders” contains the most missing values at 8% and “Shop Currency”
contains the least at 6%
2. Provide Basic Descriptions of Your Sample
and Features
At this point in our EDA, we've identified features with missing values, but we still
know very little about our data. So let’s try to fill in some of the blanks. Let's
categorize our features as either:
Continuous: A feature that is continuous can assume an infinite number of values
in a given range. An example of a continuous feature is a merchant's Gross
Merchandise Value (GMV).
Discrete: A feature that is discrete can assume a countable number of values and
is always numeric. An example of a discrete feature is a merchant's Sessions.
Categorical: A feature that is categorical can only assume a finite number of values.
An example of a categorical feature is a merchant's Shopify plan type.
The goal is to classify all your features into one of these three categories.
GMV        AOV      Conversion Rate
62894.85   628.95   0.17
NaN        1390.86  0.07
25890.06   258.90   0.04
6446.36    64.46    0.20
47432.44   258.90   0.10

Example of continuous features
Sessions   Products Sold   Fulfilled Orders
un         52              108
9          46              96
182        47              98
147        44              99
45         65              125

Example of discrete features
Plan               Country   Shop Cohort
Plus               UK        NaN
Advanced Shopify   UK        2019-04
Plus               Canada    2019-04
Advanced Shopify   Canada    2019-06
Advanced Shopify   UK        2019-06

Example of categorical features
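One way to start filling in these descriptions is with pandas built-ins. A minimal sketch, with the feature groupings taken from the tables above:

```python
continuous = ["GMV", "AOV", "Conversion Rate"]
discrete = ["Sessions", "Products Sold", "Fulfilled Orders", "Delivered Orders"]
categorical = ["Plan", "Country", "Shop Cohort"]

# Min, max, mean, and quartiles for continuous and discrete features.
print(df[continuous + discrete].describe())

# Number of unique values and their counts for categorical features.
for col in categorical:
    print(col, df[col].nunique(), "unique values")
    print(df[col].value_counts(dropna=False))
```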
You might be asking yourself, how does classifying features help us? This
categorization helps us decide what visualizations to choose in our EDA, and what
statistical methods we can apply to this data. Some visualizations won't work on all
continuous, discrete, and categorical features. This means we have to treat groups
of each type of feature differently. We will see how this works in later sections.
Let's focus on continuous features first. Record any characteristics that you think
are important, such as the maximum and minimum values for that feature. Do the
same thing for discrete features. For categorical features, some things I like to
check for are the number of unique values and the number of occurrences of each
unique value. Let's add our findings to our summary notes:
There are 3 continuous features, 4 discrete features, and 4 categorical features
* GMV:
* Continuous Feature
* Values are between $12.07 and $814468.03
* Data is missing for one day... should check to make sure this isn't a data
collection error
* Plan:
* Categorical Feature
* Assumes four values: "Basic Shopify", "Shopify", "Advanced Shopify", and
"Plus"
* The value counts of “Basic Shopify”, “Shopify”, “Advanced Shopify”, and
“Plus” are 255, 420, 450, and 375, respectively
* There seems to be more merchants on the "Advanced Shopify" plan than on
the "Basic" plan. Does this make sense?

3. Identify The Shape of Your Data

The shape of a feature can tell you so much about it. What do I mean by shape? I'm
referring to the distribution of your data, and how it can change over time. Let's
plot a few features from our dataset:
GMV and Sessions behaviour across samples
If the dataset is a time series, then we investigate how the feature changes over
time. Perhaps there's a seasonality to the feature or a positive/negative linear trend
over time. These are all important things to consider in your EDA. In the graphs
above, we can see that AOV and Sessions have positive linear trends and Sessions
exhibits a seasonality (a distinct behaviour that occurs at regular intervals). Recall that
Snapshot Date and Shop ID uniquely define our data, so the seasonality we
observe can be due to particular shops having more sessions than other shops in
the data. In the line graph below, we see that the Sessions seasonality was a result
of two specific shops: Shop 1 and Shop 51. Perhaps these shops have a higher
GMV or AOV?

Next, you'll calculate the mean and variance of each feature. Does the feature
hardly change at all? Is it constantly changing? Try to hypothesize about the
behaviour you see. A feature that has a very low or very high variance may require
additional investigation.
Probability Density Functions (PDFs) and Probability Mass Functions (PMFs) are
your friends. To understand the shape of your features, PMFs are used for discrete
features and PDFs for continuous features.
Example of feature density functions
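A sketch of how you might approximate these plots with pandas and matplotlib: a normalized histogram as a rough PDF for a continuous feature, and normalized value counts as an empirical PMF for a discrete one. The column choices are illustrative:

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Continuous feature: histogram normalized to a density (approximate PDF).
df["GMV"].plot(kind="hist", bins=50, density=True, ax=axes[0], title="GMV")

# Discrete feature: empirical probability mass function.
df["Fulfilled Orders"].value_counts(normalize=True).sort_index().plot(
    kind="bar", ax=axes[1], title="Fulfilled Orders"
)

plt.tight_layout()
plt.show()

# Quick numeric check on asymmetry.
print(df[["GMV", "AOV", "Sessions"]].skew())
```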
Here are a few things that PMFs and PDFs can tell you about your data:
* Skewness
* Is the feature heterogeneous (multimodal)?
* If the PDF has a gap in it, the feature may be disconnected
* Is it bounded?
We can see from the example feature density functions above that all three
features are skewed. Skewness measures the asymmetry of your data. This might
deter us from using the mean as a measure of central tendency. The median is
more robust, but it comes with additional computational cost.
Overall, there are a lot of things you can consider when visualizing the distribution
of your data. For more great ideas, I recommend reading Exploratory Data
Analysis. Don’t forget to update your summary notes:
* AOV:
* Continuous Feature
* Values are between $38.69 and $8994.58
* 8% of values missing
* Observe a large skewness in the data.
* Observe a positive linear trend across samples
* Sessions:
* Discrete Feature
* Contains count data (can assume non-negative integer data)
* Values are between 2 and 2257
* 7.7% of values missing
* Observe a large skewness in the data
* Observe a positive linear trend across samples
* Shops 1 and 51 have larger daily sessions and show a more rapid growth of
sessions compared to other shops in the data
* Conversion Rate:
* Continuous Feature
* Values are between 0.0 and 0.86
* 7.7% of values missing
* Observe a large skewness in the data
4. Identify Significant Correlations
Correlation measures the relationship between two variable quantities. Suppose
we focus on the correlation between two continuous features: Delivered Orders
and Fulfilled Orders. The easiest way to visualize correlation is by plotting a scatter
plot with Delivered Orders on the y axis and Fulfilled Orders on the x axis. As
expected, there's a positive relationship between these two features.
Scatter plot showing positive correlation between features "Delivered Orders” and “Fulfilled Orders”
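A one-line version of this kind of plot, assuming df holds the dataset:

```python
import matplotlib.pyplot as plt

df.plot(kind="scatter", x="Fulfilled Orders", y="Delivered Orders")
plt.show()
```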
If you have a high number of features in your dataset, then you can't create this
plot for all of them—it takes too long. So, I recommend computing the Pearson
correlation matrix for your dataset. It measures the linear correlation between
features in your dataset and assigns a value between -1 and 1 to each feature pair.
A positive value indicates a positive relationship and a negative value indicates a
negative relationship.
Correlation matrix for continuous and discrete features
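Here's a sketch of computing and drawing such a matrix with pandas and matplotlib; the column list mirrors the continuous and discrete features identified earlier:

```python
import matplotlib.pyplot as plt

numeric_cols = ["GMV", "AOV", "Conversion Rate", "Sessions",
                "Products Sold", "Fulfilled Orders", "Delivered Orders"]

# Pearson correlation: values between -1 and 1 for each feature pair.
corr = df[numeric_cols].corr(method="pearson")

plt.imshow(corr, vmin=-1, vmax=1, cmap="RdBu")
plt.xticks(range(len(numeric_cols)), numeric_cols, rotation=45, ha="right")
plt.yticks(range(len(numeric_cols)), numeric_cols)
plt.colorbar()
plt.tight_layout()
plt.show()
```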
It's important to take note of all significant correlations between features. It's
possible that you might observe many relationships between features in your
dataset, but you might also observe very little. Every dataset is different! Try to
form hypotheses around why features might be correlated with each other.
In the correlation matrix above, we see a few interesting things. First of all, we see
a large positive correlation between Fulfilled Orders and Delivered Orders, and
between GMV and AOV. We also see a slight positive correlation between Sessions,
GMV, and AOV. These are significant patterns that you should take note of.
Since our data is a time series, we also look at the autocorrelation of shops.
Computing the autocorrelation reveals relationships between a signal's current
value and its previous values. For instance, there could be a positive correlation
between a shop's GMV today and its GMV from 7 days ago. In other words, customers like to shop more on Saturdays compared to Mondays. I won't go into
any more detail here since Time Series Analysis is a very large area of statistics,
but I suggest reading A Very Short Course on Time Series Analysis.
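As a small illustration (not code from the post itself), the 7-day autocorrelation of a single shop's GMV could be checked like this; the shop ID is arbitrary:

```python
shop_gmv = (
    df[df["Shop ID"] == 1]
      .sort_values("Snapshot Date")
      .set_index("Snapshot Date")["GMV"]
)

# Values near 1 suggest a weekly pattern in this shop's GMV.
print(shop_gmv.autocorr(lag=7))
```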
Contingency table for categorical features "Shop Cohort" and "Plan", showing the count of shops in each plan and cohort-month combination (2019-01 through 2019-06)
The methodology outlined above only applies to continuous and discrete features.
There are a few ways to compute correlation between categorical features,
however, I'll only discuss one, the Pearson chi-square test. This involves taking
pairs of categorical features and computing their contingency table. Each cell in the
contingency table shows the frequency of observations. In the Pearson chi-square
test, the null hypothesis is that the categorical variables in question are
independent and, therefore, not related. In the table above, we compute the
contingency table for two categorical features from our dataset: Shop Cohort and
Plan. After that, we perform a hypothesis test using the chi-square distribution
with a specified alpha level of significance. We then determine whether the
categorical features are independent or dependent.
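A minimal sketch of that test using scipy, with the contingency table built by pandas; the 0.05 alpha level is just a common default, not one prescribed by the post:

```python
import pandas as pd
from scipy.stats import chi2_contingency

table = pd.crosstab(df["Plan"], df["Shop Cohort"])
chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: Plan and Shop Cohort appear dependent.")
else:
    print("Fail to reject the null hypothesis: no evidence of dependence.")
```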
5. Spot Outliers in the Dataset

Last, but certainly not least, spotting outliers in your dataset is a crucial step in
EDA. Outliers are significantly different from other samples in your dataset and can
lead to major problems when performing statistical tasks following your EDA.
There are many reasons why an outlier might occur. Perhaps there was a
measurement error for that sample and feature, but in many cases outliers occur
naturally.
Continuous feature box plots
The box plot visualization is extremely useful for identifying outliers. In the above
figure, we observe that all features contain quite a few outliers because we see
data points that are distant from the majority of the data.
There are many ways to identify outliers. It doesn’t make sense to plot all of our
features one at a time to spot outliers, so how do we systematically accomplish
this? One way is to compute the 1st and 99th percentiles for each feature, then
classify data points that are greater than the 99th percentile or less than the 1st
percentile as outliers. For each feature, count the number of outliers, then rank the
features from
most outliers to least outliers. Focus on the features that have the most outliers
and try to understand why that might be. Take note of your findings.
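One way to put that into practice, as a rough sketch with pandas; the percentile cutoffs and the column list are illustrative:

```python
# Flag values outside the 1st-99th percentile range, then rank features by outlier count.
outlier_counts = {}
for col in ["GMV", "AOV", "Conversion Rate", "Sessions"]:
    low, high = df[col].quantile([0.01, 0.99])
    outlier_counts[col] = int(((df[col] < low) | (df[col] > high)).sum())

for col, count in sorted(outlier_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(col, count)
```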
Unfortunately, the aforementioned approach doesn't work for categorical features
since there needs to be an ordering to compute percentiles. An outlier can mean
many things. Suppose our categorical feature can assume one of three values: apple,
orange, or pear. For 99 percent of samples, the value is either apple or orange, and
only 1 percent for pear. This is one way we might classify an outlier for this feature.
For more advanced methods on detecting anomalies in categorical data, check
out Outlier Analysis.

What's Next?
We're at the finish line and have completed our EDA. Let's review the main takeaways:
* Missing values can plague your data. Make sure to understand why they are
there and how you plan to deal with them.
* Provide a basic description of your features and categorize them. This will
drastically change the visualizations you use and the statistical methods you
apply.
* Understand your data by visualizing its distribution. You never know what you
will find! Get comfortable with how your data changes across samples and over
time.
* Your features have relationships! Make note of them. These relationships can
come in handy down the road.
* Outliers can dampen your fun only if you don't know about them. Make known
the unknowns!
But what do we do next? Well, that all depends on the problem. It's useful to
present a summary of your findings to leadership and product. By performing an
EDA, you might answer some of those crucial business questions we alluded to
earlier. Going forward, does your team want to perform a regression or
classification on the dataset? Do they want it to power a KPI dashboard? So many
wonderful options and opportunities to explore!
It's important to note that an EDA is a very large area of focus. The steps I suggest
are by no means exhaustive, and if I had the time, I could write a book on the
subject! In this post, I share some of the most common approaches, but there's so
much more that can be added to your own EDA.
If you're passionate about data at scale, and you're eager to learn more, we're
hiring! Reach out to us or apply on our careers page.
Cody Mazza-Anthony is a Data Scientist on the Insights team. He is currently
working on building a Product Analytics framework for Insights. Cody enjoys
building intelligent systems to enable merchants to grow their business. If you'd
like to connect with Cody, reach out on LinkedIn.
Wherever you are, your next journey starts here! If building systems from the
ground up to solve real-world problems interests you, our Engineering blog has
stories about other challenges we have encountered. Intrigued? Visit our Data
Science & Engineering career page to find out about our open positions. Learn
about how we're hiring to design the future together—a future that is Digital by
Design.