COURSE NAME: DATA ANALYTICS LAB COURSE CODE: 22ML607PC
Write Python programs for the following
1. Data Preprocessing
a. Handling missing values
b. Noise detection removal
c. Identifying data redundancy and elimination
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from scipy import stats
from sklearn.preprocessing import StandardScaler
# Sample dataset with missing values, noise, and redundant data
data = {
'A': [10, 15, np.nan, 20, 25, 30, np.nan, 35, 1000], # Contains missing values & an outlier (1000)
'B': [5, 7, 8, 5, 10, 12, 8, 5, 15], # No missing values
'C': [10, 15, 20, 25, 30, 35, 40, 45, 50], # Highly correlated with A (redundant)
'D': ['yes', 'no', np.nan, 'yes', 'no', 'yes', 'no', 'yes', 'no'] # Categorical with missing values
}
df = pd.DataFrame(data)
print("Original Data:\n", df)
# ---------- Handling Missing Values ----------
imputer_numeric = SimpleImputer(strategy='mean') # Using mean imputation for numerical columns
df[['A']] = imputer_numeric.fit_transform(df[['A']]) # Fill missing values in column 'A'
imputer_categorical = SimpleImputer(strategy='most_frequent') # Using mode imputation for categorical data
df[['D']] = imputer_categorical.fit_transform(df[['D']]) # Fill missing values in column 'D'
print("\nData After Handling Missing Values:\n", df)
# ---------- Noise Detection & Removal (Z-score method) ----------
z_scores = np.abs(stats.zscore(df[['A', 'B']])) # Compute Z-scores for numerical columns
df_no_noise = df[(z_scores < 3).all(axis=1)] # Remove rows where Z-score > 3
print("\nData After Removing Noise:\n", df_no_noise)
# ---------- Identifying & Removing Redundant Data ----------
correlation_matrix = df_no_noise.corr(numeric_only=True) # Compute correlation matrix (numeric columns only)
# Flag columns highly correlated (>0.95) with another column; keep 'A' as the reference feature
high_correlation_features = [col for col in correlation_matrix.columns
                             if (correlation_matrix[col].drop(col) > 0.95).any() and col != 'A']
df_final = df_no_noise.drop(columns=high_correlation_features) # Drop highly correlated features
print("\nFinal Processed Data (Redundant Features Removed):\n", df_final)
OUTPUT:
Original Data:
A B C D
0 10.0 5 10 yes
1 15.0 7 15 no
2 NaN 8 20 NaN
3 20.0 5 25 yes
4 25.0 10 30 no
5 30.0 12 35 yes
6 NaN 8 40 no
7 35.0 5 45 yes
8 1000.0 15 50 no
Data After Handling Missing Values:
A B C D
0 10.000000 5 10 yes
1 15.000000 7 15 no
2 162.142857 8 20 no
3 20.000000 5 25 yes
4 25.000000 10 30 no
5 30.000000 12 35 yes
6 162.142857 8 40 no
7 35.000000 5 45 yes
8 1000.000000 15 50 no
Data After Removing Noise:
A B C D
0 10.000000 5 10 yes
1 15.000000 7 15 no
2 162.142857 8 20 no
3 20.000000 5 25 yes
4 25.000000 10 30 no
5 30.000000 12 35 yes
6 162.142857 8 40 no
7 35.000000 5 45 yes
8 1000.000000 15 50 no
2. Implement any one imputation model
import pandas as pd
import numpy as np
def mean_imputation(data):
    """Imputes missing values with the mean of each column."""
    return data.fillna(data.mean())
# Example dataset with missing values
data = pd.DataFrame({
'A': [1, 2, np.nan, 4, 5],
'B': [3, np.nan, 7, 8, 9],
'C': [10, 11, 12, np.nan, 14]
})
print("Original Data:")
print(data)
# Apply mean imputation
imputed_data = mean_imputation(data)
print("\nData after Mean Imputation:")
print(imputed_data)
OUTPUT:
Original Data:
A B C
0 1.0 3.0 10.0
1 2.0 NaN 11.0
2 NaN 7.0 12.0
3 4.0 8.0 NaN
4 5.0 9.0 14.0
Data after Mean Imputation:
A B C
0 1.0 3.00 10.00
1 2.0 6.75 11.00
2 3.0 7.00 12.00
3 4.0 8.00 11.75
4 5.0 9.00 14.00
What is an imputer?
An imputer is an estimator used to fill missing values in a dataset. For numerical columns it supports strategies such as mean, median, and constant; for categorical columns it supports most frequent and constant. You can also train a model to predict the missing values.
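A minimal sketch of using scikit-learn's SimpleImputer with the median strategy (the column name and values are illustrative):
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Illustrative data with one missing entry
df = pd.DataFrame({'age': [22, 35, np.nan, 41]})

# Median imputation is less sensitive to outliers than mean imputation
imputer = SimpleImputer(strategy='median')
df[['age']] = imputer.fit_transform(df[['age']])
print(df)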
What are NumPy and Pandas?
NumPy and Pandas are two popular Python libraries often used in data analytics. NumPy is used for working with arrays and also provides functions for linear algebra, Fourier transforms, and matrices. NumPy excels at creating N-dimensional data objects and performing mathematical operations efficiently, while Pandas is renowned for data wrangling and its ability to handle large tabular datasets.
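A small sketch contrasting the two libraries (the array values are illustrative):
import numpy as np
import pandas as pd

# NumPy: fast vectorized math on N-dimensional arrays
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.mean(axis=0))        # column-wise means

# Pandas: labeled, tabular data built on top of NumPy
df = pd.DataFrame(arr, columns=['a', 'b', 'c'])
print(df.describe())           # summary statistics per column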
What are the uses of sklearn?
scikit-learn (sklearn) is one of the most useful libraries for machine learning in Python. It contains many efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction. Generated sklearn datasets are synthetic datasets created with the sklearn library; they are used for testing, benchmarking, and developing machine learning algorithms/models.
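A minimal sketch of generating a synthetic classification dataset with sklearn (the parameter values are arbitrary):
from sklearn.datasets import make_classification

# 200 samples, 4 features, 2 classes; random_state makes the data reproducible
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, n_classes=2, random_state=42)
print(X.shape, y.shape)   # (200, 4) (200,)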
What is a dataframe?
A DataFrame is a data structure organized into rows and columns, similar to a database table or an Excel/Calc spreadsheet. A Pandas DataFrame is a two-dimensional, size-mutable, heterogeneous tabular data structure with labeled rows and columns.
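A short sketch of creating and inspecting a DataFrame (the column names and values are illustrative):
import pandas as pd

df = pd.DataFrame({'name': ['Asha', 'Ravi'], 'marks': [82, 91]})
print(df.shape)    # (2, 2) -> rows, columns
print(df.dtypes)   # per-column data types
print(df.head())   # first few rows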
What is a dict{} in Python?
A dictionary can be created by placing a sequence of key-value pairs within curly braces {}, separated by commas. Python dictionaries are ordered (insertion order is preserved since Python 3.7). Dictionary keys are case sensitive: the same name with different cases is treated as a distinct key. With dictionaries you access values via their keys. Keys can be of any immutable (hashable) type (int, float, string, and even tuple). A dictionary may contain duplicate values, but the keys must be unique (so it isn't possible to access different values via the same key).
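A minimal sketch illustrating these properties (the keys and values are illustrative):
# Keys are case sensitive and must be unique; values may repeat
scores = {'math': 90, 'Math': 75, ('lab', 1): 90}

print(scores['math'])      # 90  -> access by key
print(scores['Math'])      # 75  -> 'Math' is a different key from 'math'
print(scores[('lab', 1)])  # 90  -> tuples can be keys because they are hashable

scores['math'] = 95        # re-assigning an existing key overwrites its value
print(list(scores))        # insertion order is preserved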
3. Implement Linear Regression
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Generate synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train Linear Regression using sklearn
model = LinearRegression()
model.fit(X_train, y_train)
# Predict on test set
y_pred = model.predict(X_test)
# Calculate mean squared error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
# Plot results
plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.plot(X_test, y_pred, color='red', linewidth=2, label='Predicted')
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()
# Manual Implementation of Linear Regression using Normal Equation
X_b = np.c_[np.ones((100, 1)), X] # Add bias term
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print(f"Calculated coefficients: {theta_best.ravel()}")
What is Linear Regression?
Linear Regression is a supervised learning algorithm used for predicting a continuous dependent
variable based on one or more independent variables. It models the relationship between
variables by fitting a linear equation:
y = β0 + β1x1 + β2x2 + ... + βnxn + ε
where:
y is the dependent variable (target),
x1, ..., xn are the independent variables (features),
β0, β1, ..., βn are the coefficients (weights),
ε is the error term.
The goal of Linear Regression is to find the best-fitting line (or hyperplane in higher dimensions)
that minimizes the difference between predicted and actual values, often using methods like
Ordinary Least Squares (OLS).
Linear Regression
Predicts continuous values.
The model fits a straight line to the data.
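A minimal sketch of OLS for the one-feature case, estimating the slope and intercept from the closed-form formulas β1 = cov(x, y)/var(x) and β0 = mean(y) - β1*mean(x) (the data below is illustrative):
import numpy as np

# Illustrative data: y is roughly 4 + 3x plus noise
rng = np.random.default_rng(0)
x = 2 * rng.random(100)
y = 4 + 3 * x + rng.normal(size=100)

# Ordinary Least Squares for a single feature
beta1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)   # slope
beta0 = y.mean() - beta1 * x.mean()                  # intercept
print(f"Intercept: {beta0:.3f}, Slope: {beta1:.3f}")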
4. Implement Logistic Regression
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
# Generate synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (X > 1).astype(int).ravel() # Binary classification based on threshold
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train Logistic Regression using sklearn
model = LogisticRegression()
model.fit(X_train, y_train)
# Predict on test set
y_pred = model.predict(X_test)
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)
# Plot decision boundary
X_values = np.linspace(0, 2, 100).reshape(-1, 1)
y_proba = model.predict_proba(X_values)[:, 1]
plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.plot(X_values, y_proba, color='red', linewidth=2, label='Predicted Probability')
plt.xlabel("X")
plt.ylabel("Probability")
plt.legend()
plt.show()
What is Logistic Regression?
Logistic Regression is a supervised learning algorithm used for classification problems.
Unlike Linear Regression, which predicts continuous values, Logistic Regression predicts
probabilities and assigns data points to discrete classes (e.g., 0 or 1, spam or not spam, disease
or no disease).
Mathematical Formulation
Instead of a direct linear equation like in Linear Regression, Logistic Regression uses the
sigmoid (logistic) function to map outputs between 0 and 1:
P(y=1) = 1 / (1 + e^(-(β0 + β1x1 + β2x2 + ... + βnxn)))
where:
P(y=1) is the probability that the output belongs to class 1.
β0, β1, ..., βn are the model coefficients.
x1, x2, ..., xn are the input features.
The sigmoid function transforms the linear output into a probability in the range (0, 1).
Classification Decision
Once the probability is computed, the decision boundary is set (commonly at 0.5):
If P(y=1) ≥ 0.5, classify as 1.
If P(y=1) < 0.5, classify as 0.
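A minimal sketch of the sigmoid function and the 0.5 threshold rule (the coefficients β0 and β1 are made up for illustration):
import numpy as np

def sigmoid(z):
    """Map any real value into the (0, 1) probability range."""
    return 1 / (1 + np.exp(-z))

# Assumed (illustrative) coefficients of a fitted one-feature model
beta0, beta1 = -4.0, 4.0
x = np.array([0.2, 0.8, 1.1, 1.7])

proba = sigmoid(beta0 + beta1 * x)     # P(y=1) for each sample
labels = (proba >= 0.5).astype(int)    # apply the 0.5 decision boundary
print(proba.round(3), labels)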
Types of Logistic Regression
1. Binary Logistic Regression (Two classes, e.g., spam vs. not spam).
2. Multinomial Logistic Regression (More than two classes, e.g., cat, dog, horse).
3. Ordinal Logistic Regression (Ordered classes, e.g., low, medium, high risk).
Loss Function in Logistic Regression
Instead of the Mean Squared Error (MSE) used in Linear Regression, Logistic Regression optimizes the Log Loss (Cross-Entropy Loss).
It ensures the model penalizes wrong classifications more strongly.
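A minimal sketch of the log-loss computation, L = -(1/N) Σ [y·log(p) + (1-y)·log(1-p)] (the labels and predicted probabilities are illustrative):
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy; eps keeps log() away from 0."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 1])
y_prob = np.array([0.9, 0.2, 0.6, 0.95])  # confident correct predictions -> low loss
print(log_loss(y_true, y_prob))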
Logistic Regression
Predicts binary class probabilities.
The model fits an S-shaped sigmoid curve.
Note: Linear Regression produces continuous outputs, while Logistic Regression produces probabilities mapped to class labels.
When to Use Logistic Regression?
✅ When you need classification (yes/no, pass/fail, fraud/not fraud).
✅ When the relationship between independent variables and output is non-linear but can be
mapped using probabilities.
✅ When the dataset is small to medium-sized, as logistic regression is computationally
efficient.
Logistic Regression is suitable for classification tasks, whereas Linear Regression is for regression tasks.
Real-Life Examples of Linear and Logistic Regression
📌 Linear Regression Examples (Predicting Continuous Values)
1. House Price Prediction
o Predicting house prices based on factors like size, location, number of bedrooms, and
age.
2. Stock Market Forecasting
o Predicting stock prices based on past trends, economic indicators, and company
performance.
3. Salary Prediction
o Estimating an employee's salary based on experience, education, and skills.
4. Temperature Prediction
o Forecasting the temperature of a city based on historical weather data, humidity, and
wind speed.
Logistic Regression Examples (Predicting Classifications)
1. Spam Detection 📩
o Classifying emails as spam (1) or not spam (0) based on word frequency and metadata.
2. Disease Diagnosis 🏥
o Predicting whether a patient has diabetes (1) or not (0) based on glucose levels, age, and
BMI.
3. Credit Card Fraud Detection 💳
o Identifying fraudulent transactions based on transaction patterns, location, and
frequency.
4. Customer Churn Prediction 📊
o Predicting whether a customer will continue (0) or cancel (1) a subscription based on
usage and complaints.
5. Implement Decision Tree Classifier
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load dataset
data = load_iris()
X, y = data.data, data.target
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train Decision Tree classifier
clf = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=42)
clf.fit(X_train, y_train)
# Make predictions
y_pred = clf.predict(X_test)
# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
# Display the decision tree rules
print(export_text(clf, feature_names=data.feature_names))
NOTES:
Decision Tree Classifier:
A Decision Tree Classifier is a supervised learning algorithm used for
classification tasks. It works by splitting the data into subsets based on feature
values, forming a tree-like structure where:
Each internal node represents a decision based on a feature.
Each branch represents an outcome of that decision.
Each leaf node represents a class label (final prediction).
How It Works:
The algorithm selects the best feature to split the dataset using criteria like:
Gini Impurity (default in sklearn; see the sketch after this list)
Entropy (Information Gain)
It recursively splits the data, forming a tree structure.
The process stops when:
A predefined depth is reached.
All samples in a node belong to the same class.
Further splits don’t improve accuracy.
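A minimal sketch of the Gini impurity calculation a splitter evaluates at a node (the class labels are illustrative, not sklearn's internal code):
import numpy as np

def gini_impurity(labels):
    """Gini = 1 - sum(p_k^2) over the class proportions p_k at a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1 - np.sum(p ** 2)

print(gini_impurity([0, 0, 0, 0]))        # 0.0 -> pure node
print(gini_impurity([0, 0, 1, 1]))        # 0.5 -> maximally mixed (2 classes)
print(gini_impurity([0, 0, 1, 2, 2, 2]))  # mixed 3-class node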
Advantages:
Easy to understand and interpret
Requires little data preprocessing (no need for feature scaling)
Handles both numerical and categorical data
Disadvantages:
Prone to overfitting (solved using pruning or ensemble methods like Random
Forest)
Can be unstable with small data changes
What is an Iris dataset?
The Iris dataset is a well-known dataset in machine learning and statistics, used primarily for
classification tasks. It consists of 150 samples of iris flowers, categorized into three species:
Setosa
Versicolor
Virginica
Each sample has four features (measured in centimeters):
Sepal length
Sepal width
Petal length
Petal width
It is often used in educational contexts for classification tasks because:
Simplicity – It is small, well-structured, and easy to understand and visualize.
Accessibility – It is built into scikit-learn, making it easy to access.
Balanced Classes – The dataset has three classes with roughly equal representation.
Benchmarking – Many algorithms have been tested on it, making it a good reference.
Well-Defined Features – The four numerical features (sepal length, sepal width, petal length,
petal width) provide clear distinctions between classes. The classes are well-separated, making it
a good dataset for testing classification algorithms.
However, you can use other datasets like:
Wine Dataset (sklearn.datasets.load_wine) – Good for multi-class classification.
Breast Cancer Dataset (sklearn.datasets.load_breast_cancer) – Used for binary classification.
Custom Data – You can use real-world datasets from CSV files or databases.
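A minimal sketch of swapping in one of these alternatives (load_wine here; the split parameters mirror the Iris example above):
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Same workflow as the Iris example, only the dataset loader changes
data = load_wine()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)
print(f"Accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")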
6. Implement Random Forest Classifier
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load dataset
data = load_iris()
X, y = data.data, data.target
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train Random Forest classifier
clf = RandomForestClassifier(n_estimators=100, criterion='gini', max_depth=3, random_state=42)
clf.fit(X_train, y_train)
# Make predictions
y_pred = clf.predict(X_test)
# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
NOTE: This program modifies the previous one to use a Random Forest classifier instead of a Decision Tree classifier.
NOTES:
A Random Forest Classifier is an ensemble learning method that builds multiple decision trees and
combines their predictions to improve accuracy and reduce overfitting. Here's how it works:
Bootstrap Sampling – The dataset is randomly sampled with replacement to create multiple training
subsets.
Multiple Decision Trees – A decision tree is trained on each subset.
Random Feature Selection – Each tree considers a random subset of features at each split, increasing
diversity among trees.
Voting/Averaging – For classification, the majority vote from all trees determines the final prediction.
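A minimal sketch of the bootstrap-and-vote idea, built by hand from a few decision trees (illustrative only; RandomForestClassifier does this internally and also randomizes the features considered at each split):
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(42)

# Train several trees, each on a bootstrap sample (rows drawn with replacement)
trees = []
for _ in range(5):
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], y[idx]))

# Majority vote across the trees gives the ensemble prediction
samples = X[::30]                                         # a few illustrative samples
votes = np.array([t.predict(samples) for t in trees])     # shape: (n_trees, n_samples)
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print(votes)
print("Ensemble prediction:", majority)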
Advantages of Random Forest
Reduces overfitting compared to a single decision tree
Handles missing values and large datasets well
Works for both classification and regression tasks
Can measure feature importance
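A minimal sketch of reading feature importances from a Random Forest trained on the Iris data (parameters mirror the program above):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(data.data, data.target)

# Importances sum to 1; higher values mean a feature contributed more to the forest's splits
for name, importance in zip(data.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")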