Abhiram Iyengar
Anshul Joshi
Nikita Andhale
BDS: Homework 1
Submit:
1. A pdf of your notebook with solutions.
2. A link to your colab notebook
Goals of this homework
1. More experience with regression and ridge regression (regularization)
2. Start playing with Kaggle
3. More experience with Lasso.
4. An initial shot at ensembling and stacking.
Problem 1 (Nothing to turn in)
Go through all the notebooks we have done in class and make sure you understand what we did, and why.
Problem 2: Starting in Kaggle.
Later this month, we are opening a Kaggle competition made for this class. In that one, you will be participating on your own. This is an intro to
get us started, and also an excuse to work with regularization and regression which we have been discussing.
1. Let’s start with our first Kaggle submission in a playground regression competition. Make an account on Kaggle and find
https://www.kaggle.com/c/house-prices-advanced-regression-techniques/
2. Follow the data preprocessing steps from https://www.kaggle.com/code/apapiu/regularized-linear-models. Then run a ridge regression
using λ = 0.1. Make a submission of this prediction; what RMSE do you get? (Hint: remember to exponentiate your predictions with
np.expm1(ypred).)
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
from scipy.stats import skew
from scipy.stats import pearsonr
%config InlineBackend.figure_format = 'retina' #set 'png' here when working on notebook
%matplotlib inline
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
train.head()
   Id  MSSubClass MSZoning  LotFrontage  LotArea Street Alley LotShape LandContour Utilities ... PoolArea PoolQC Fence ...
0   1          60       RL         65.0     8450   Pave   NaN      Reg         Lvl    AllPub ...        0    NaN   NaN ...
1   2          20       RL         80.0     9600   Pave   NaN      Reg         Lvl    AllPub ...        0    NaN   NaN ...
2   3          60       RL         68.0    11250   Pave   NaN      IR1         Lvl    AllPub ...        0    NaN   NaN ...
3   4          70       RL         60.0     9550   Pave   NaN      IR1         Lvl    AllPub ...        0    NaN   NaN ...
4   5          60       RL         84.0    14260   Pave   NaN      IR1         Lvl    AllPub ...        0    NaN   NaN ...

5 rows × 81 columns
all_data = pd.concat((train.loc[:,'MSSubClass':'SaleCondition'],
test.loc[:,'MSSubClass':'SaleCondition']))
First I'll transform the skewed numeric features by taking log(feature + 1) - this will make the features more normal
matplotlib.rcParams['figure.figsize'] = (12.0, 6.0)
prices = pd.DataFrame({"price":train["SalePrice"], "log(price + 1)":np.log1p(train["SalePrice"])})
prices.hist()
array([[<Axes: title={'center': 'price'}>,
<Axes: title={'center': 'log(price + 1)'}>]], dtype=object)
#log transform the target:
train["SalePrice"] = np.log1p(train["SalePrice"])
#log transform skewed numeric features:
numeric_feats = all_data.dtypes[all_data.dtypes != "object"].index
skewed_feats = train[numeric_feats].apply(lambda x: skew(x.dropna())) #compute skewness
skewed_feats = skewed_feats[skewed_feats > 0.75]
skewed_feats = skewed_feats.index
all_data[skewed_feats] = np.log1p(all_data[skewed_feats])
Create Dummy variables for the categorical features
all_data = pd.get_dummies(all_data)
Replace the numeric missing values (NaN's) with the mean of their respective columns
#filling NA's with the mean of the column:
all_data = all_data.fillna(all_data.mean())
#creating matrices for sklearn:
X_train = all_data[:train.shape[0]]
X_test = all_data[train.shape[0]:]
y = train.SalePrice
Models

Now we are going to use regularized linear regression models from the scikit-learn module. I'm going to try both ℓ1 (Lasso) and ℓ2 (Ridge)
regularization. I'll also define a function that returns the cross-validation RMSE so we can evaluate our models and pick the best tuning
parameter.
from sklearn.linear_model import Ridge, RidgeCV, ElasticNet, LassoCV, LassoLarsCV
from sklearn.model_selection import cross_val_score
def rmse_cv(model):
    rmse = np.sqrt(-cross_val_score(model, X_train, y, scoring="neg_mean_squared_error", cv=5))
    return rmse
model_ridge = Ridge()
The main tuning parameter for the Ridge model is alpha, a regularization parameter that controls how flexible our model is. The higher the
regularization, the less prone our model will be to overfitting. However, it will also lose flexibility and might not capture all of the signal in the data.
alphas = [0.05, 0.1, 0.3, 1, 3, 5, 10, 15, 30, 50, 75]
cv_ridge = [rmse_cv(Ridge(alpha = alpha)).mean()
for alpha in alphas]
cv_ridge = pd.Series(cv_ridge, index = alphas)
cv_ridge.plot(title = "Validation - Just Do It")
plt.xlabel("alpha")
plt.ylabel("rmse")
Text(0, 0.5, 'rmse')
cv_ridge.min()
0.12731233261727531
So for the Ridge regression we get an RMSLE of about 0.127 with alpha = 10.
Run a ridge regression using λ = 0.1. Make a submission of this prediction; what RMSE do you get?
#(Hint: remember to exponentiate np.expm1(ypred) your predictions).
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
import numpy as np
# Initialize Ridge model with alpha (λ) = 0.1
ridge_model = Ridge(alpha=0.1)
# Fit the Ridge regression model to the training data
ridge_model.fit(X_train, y)
Ridge(alpha=0.1)
# Make predictions on the test set
ridge_preds = ridge_model.predict(X_test)
# Exponentiate the predictions to reverse the log1p transformation
ridge_preds_exp = np.expm1(ridge_preds)
# Prepare the submission file
submission = pd.DataFrame({"Id": test["Id"], "SalePrice": ridge_preds_exp})
# Save the submission to a CSV file
submission.to_csv("ridge_submission.csv", index=False)
def rmse_cv(model):
    rmse = np.sqrt(-cross_val_score(model, X_train, y, scoring="neg_mean_squared_error", cv=5))
    return rmse
# Calculate RMSE for Ridge model
rmse_ridge = rmse_cv(ridge_model).mean()
print(f"RMSE for Ridge Regression with λ=0.1: {rmse_ridge}")
RMSE for Ridge Regression with λ=0.1: 0.13774989813144883
House Prices - Advanced Regression Techniques : Kaggle Score = 0.13564
Problem 3: Continuing in Kaggle
1. Compare a ridge regression and a lasso regression model. Optimize the regularization constants using cross-validation. This means that
you will have to select different values of the regularization parameters, set up a k-fold cross-validation experiment to decide which of
these is best, and then finally compare your best ridge regression model with your best lasso regression model.
What is the best score you can get from a single ridge regression model and from a single lasso model?
2. The ℓ0 (or L0) norm is the number of nonzeros of a vector. Plot the L0 norm of the coefficients that lasso produces as you vary the
strength of the regularization parameter λ.
PROBLEM 3: 1. Let's try out the Lasso model. We will take a slightly different approach here and use the built-in LassoCV to figure out the
best alpha for us. For some reason the alphas in LassoCV are really the inverse of the alphas in Ridge.
model_lasso = LassoCV(alphas = [1, 0.1, 0.001, 0.0005]).fit(X_train, y)
rmse_cv(model_lasso).mean()
0.1225674790699958
Nice! The lasso performs even better, so we'll just use this one to predict on the test set. Another neat thing about the Lasso is that it does
feature selection for you, setting coefficients of features it deems unimportant to zero. Let's take a look at the coefficients:
coef = pd.Series(model_lasso.coef_, index = X_train.columns)
print("Lasso picked " + str(sum(coef != 0)) + " variables and eliminated the other " + str(sum(coef == 0)) + " variables")
Lasso picked 110 variables and eliminated the other 177 variables
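To see which features the Lasso relies on most, here is a small sketch in the spirit of the source kernel (reusing coef from the cell above): sort the coefficients and plot the ten most negative and ten most positive.

import pandas as pd
import matplotlib.pyplot as plt

# Ten most negative and ten most positive Lasso coefficients
imp_coef = pd.concat([coef.sort_values().head(10),
                      coef.sort_values().tail(10)])

# Horizontal bar chart of the most influential features
plt.rcParams['figure.figsize'] = (8.0, 10.0)
imp_coef.plot(kind="barh")
plt.title("Coefficients in the Lasso Model")
plt.show()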
optimal_alpha = model_lasso.alpha_
print(f"Optimal alpha for Lasso: {optimal_alpha}")
Optimal alpha for Lasso: 0.0005
Let's find the best alpha for the Ridge model using cross-validation.
from sklearn.linear_model import RidgeCV
import numpy as np
# Define a range of alpha values to test
alphas = [0.1, 1.0, 10.0, 100.0]
# Initialize RidgeCV model with the specified alphas
ridge_cv = RidgeCV(alphas=alphas, scoring='neg_mean_squared_error', cv=5)
# Fit the model to the training data
ridge_cv.fit(X_train, y)
# Get the best alpha value
best_alpha_ridge = ridge_cv.alpha_
print(f"Optimal alpha for Ridge: {best_alpha_ridge}")
# Calculate RMSE for the best Ridge model
rmse_ridge_cv = np.sqrt(-cross_val_score(ridge_cv, X_train, y, scoring="neg_mean_squared_error", cv=5)).mean()
print(f"RMSE for Ridge Regression with best alpha: {rmse_ridge_cv}")
Optimal alpha for Ridge: 10.0
RMSE for Ridge Regression with best alpha: 0.12731233261727531
PROBLEM 3: 1) What is the best score you can get from a single ridge regression model and from a single lasso model?

Best score comparison:

Optimal alpha for Ridge: 10.0, cross-validation RMSE: 0.12731233261727531
Optimal alpha for Lasso: 0.0005, cross-validation RMSE: 0.1225674790699958

The best score (lowest cross-validation RMSE) is achieved by the Lasso regression model, 0.1226 versus 0.1273 for Ridge; the Lasso
outperforms the Ridge model by a small margin in this case.
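For completeness, a Kaggle submission from the best Lasso model can be generated the same way as the ridge submission above (a sketch assuming model_lasso, X_test, and test from the cells above; remember to undo the log1p transform with np.expm1):

import numpy as np
import pandas as pd

# Predict log(SalePrice + 1) on the test set with the cross-validated Lasso
lasso_preds = model_lasso.predict(X_test)

# Undo the log1p transform that was applied to the target
lasso_preds_exp = np.expm1(lasso_preds)

# Write the submission file in the format Kaggle expects (Id, SalePrice)
submission = pd.DataFrame({"Id": test["Id"], "SalePrice": lasso_preds_exp})
submission.to_csv("lasso_submission.csv", index=False)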
PROBLEM 3: 2) The ℓ0 (or L0) norm is the number of nonzeros of a vector. Plot the L0 norm of the coefficients that lasso produces as you
vary the strength of the regularization parameter λ.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso
# Define a range of alpha values (regularization parameters)
alphas = np.logspace(-4, 1, 50)
# Initialize lists to store results
l0_norms = []
# Loop over each alpha value
for alpha in alphas:
    # Initialize and fit the Lasso model
    lasso = Lasso(alpha=alpha, max_iter=10000)
    lasso.fit(X_train, y)  # fit on the Kaggle training matrices X_train and y from above
    # Calculate the L0 norm (number of non-zero coefficients)
    l0_norm = np.sum(lasso.coef_ != 0)
    l0_norms.append(l0_norm)
# Plot the results
plt.figure(figsize=(10, 6))
plt.plot(alphas, l0_norms, marker='o')
plt.xscale('log')
plt.xlabel('Regularization Parameter (λ)')
plt.ylabel('L0 Norm of Coefficients')
plt.title('L0 Norm vs Regularization Strength in Lasso Regression')
plt.grid(True)
plt.show()
Trends:
High L0 Norm at Low λ: Minimal regularization leads to nearly all coefficients being non-zero.
Decreasing L0 Norm with Increasing λ: As λ increases, more coefficients are set to zero, showcasing Lasso's feature selection capability.
Plateau at High λ: Beyond around 10^-1, the number of non-zero coefficients stabilizes near zero, indicating strong regularization and the
exclusion of most features.
Interpretation:
Feature Selection: Lasso effectively reduces features by zeroing out coefficients as λ increases.
Model Complexity: Lower λ values yield complex models with more features, while higher λ values simplify the model.
Optimal Regularization: The ideal λ balances retaining essential features and eliminating noise, typically where the curve flattens.
Problem 4: Introduction to Stacking and Ensembling
Add the outputs of your models as features and train a ridge regression on all the features plus the model outputs (This is called Ensembling
and Stacking). Be careful not to overfit. What score can you get?
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import mean_squared_error
# X_train, y, and X_test are the processed Kaggle matrices from the cells above
# Train Ridge and Lasso models
ridge_model = Ridge(alpha=10).fit(X_train, y)
lasso_model = Lasso(alpha=0.0005).fit(X_train, y)
# Generate predictions on training data
ridge_preds_train = ridge_model.predict(X_train)
lasso_preds_train = lasso_model.predict(X_train)
# Generate predictions on test data
ridge_preds_test = ridge_model.predict(X_test)
lasso_preds_test = lasso_model.predict(X_test)
# Create new feature sets
X_train_stack = np.hstack((X_train, ridge_preds_train.reshape(-1, 1), lasso_preds_train.reshape(-1, 1)))
X_test_stack = np.hstack((X_test, ridge_preds_test.reshape(-1, 1), lasso_preds_test.reshape(-1, 1)))
# Train final Ridge regression model on stacked features
ridge_final_model = Ridge(alpha=10).fit(X_train_stack, y)
# Evaluate performance using cross-validation
rmse_final = np.sqrt(-cross_val_score(ridge_final_model, X_train_stack, y, scoring='neg_mean_squared_error', cv=5)).mean()
print(f"RMSE for stacked model: {rmse_final}")
# Predict on test data using stacked model
final_predictions = ridge_final_model.predict(X_test_stack)
# Prepare submission file with IDs and predicted SalePrice
submission = pd.DataFrame({"Id": test["Id"], "SalePrice": final_predictions})
submission.to_csv("stacked_submission.csv", index=False)
RMSE for stacked model: 0.12356315812543787
Kaggle Score after Stack Submission : 0.12496
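One caveat with the approach above: the stacked features are in-sample predictions of models fit on the full training set, which leaks the training targets into the new features and can make the cross-validated RMSE look better than it really is. A more careful variant (a sketch, not the submitted version) uses out-of-fold predictions from cross_val_predict as the stacking features:

import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import cross_val_predict, cross_val_score

# Out-of-fold predictions: each training row is predicted by a model
# that never saw that row during fitting
ridge_oof = cross_val_predict(Ridge(alpha=10), X_train, y, cv=5)
lasso_oof = cross_val_predict(Lasso(alpha=0.0005, max_iter=10000), X_train, y, cv=5)

# Stack the out-of-fold predictions onto the original features
X_train_stack_oof = np.hstack((X_train, ridge_oof.reshape(-1, 1), lasso_oof.reshape(-1, 1)))

# Evaluate the stacked ridge model with cross-validation, as before
rmse_oof = np.sqrt(-cross_val_score(Ridge(alpha=10), X_train_stack_oof, y,
                                    scoring='neg_mean_squared_error', cv=5)).mean()
print(f"RMSE for stacked model with out-of-fold features: {rmse_oof}")

Because the stacking features for each row come from models that did not train on that row, this estimate is a fairer indication of how the stacked model will generalize.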
Problem 5
Use the data generation used in the LASSO notebook where we first introduced Lasso, to generate data.
You can find that in the pages tab in Canvas.
1. Manually implement forward selection. Report the order in which you add features.
2. In this example, we know the true support size is 5. But what if we did not know this? Plot test error as a function of the size of the
support. Use this to recover the true support size. Justify your answer.
3. Use Lasso with a manually implemented Cross validation using the metric of your choice. What is the value of the hyperparameter?
(Manually implemented means that you can either do it entirely on your own, or you can use GridSearchCV, but I’m asking you not to use
LassoCV, which you will use in the next problem).
4. (Optional) Change the number of folds in your CV and repeat the previous step. How does the optimal value of the hyperparameter
change? Try to explain any trends that you find.
5. (Optional) Read about and use LassoCV from sklearn.linear_model. How does this compare with what you did in the previous step? If they
agree, explain why they agree; if they disagree, explain why. This will require you to make sure you understand what LassoCV is doing.
Step 0: Generate Data
np.random.seed(7)
n_samples, n_features = 100, 200
X = np.random.randn(n_samples, n_features)
k = 5
# beta generated with k nonzeros
#coef = 10 * np.random.randn(n_features)
coef = 10 * np.ones(n_features)
inds = np.arange(n_features)
np.random.shuffle(inds)
coef[inds[k:]] = 0 # sparsify coef
y = np.dot(X, coef)
# add per-sample noise
y += 0.01 * np.random.normal(size=n_samples)
# Split data in train set and test set
n_samples = X.shape[0]
X_train, y_train = X[:25], y[:25]
X_test, y_test = X[25:], y[25:]
Step 1: Manually Implement Forward Selection
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# Assuming X_train, y_train are already defined as in the previous code
# Forward selection implementation
selected_features = []
corresponding_mse = []
remaining_features = list(range(X_train.shape[1]))
# Limiting selection to top 10 for demonstration purposes
for _ in range(10):
    best_feature = None
    best_mse = float('inf')
    for feature in remaining_features:
        current_features = selected_features + [feature]
        model = LinearRegression().fit(X_train[:, current_features], y_train)
        y_pred = model.predict(X_train[:, current_features])
        mse = mean_squared_error(y_train, y_pred)
        if mse < best_mse:
            best_mse = mse
            best_feature = feature
    selected_features.append(best_feature)
    corresponding_mse.append(best_mse)
    remaining_features.remove(best_feature)
print("Selected features:", selected_features)
print("MSE for Selected features:", corresponding_mse)
Selected features: [15, 18, 78, 76, 29, 80, 55, 0, 27, 62]
MSE for Selected features: [274.9259047713406, 127.87410588555692, 50.25889203977732, 28.96253432071043, 17.198394502743234, 11.59848226
Step 2: Estimate the True Support Size by Plotting Test Error
import numpy as np
import matplotlib.pyplot as plt
# Errors recorded during forward selection (training MSEs, used here as a proxy for test error)
test_errors = corresponding_mse
# Support sizes for the feature selections
support_sizes = range(1, len(selected_features) + 1)
# Plot the test error as a function of the support size
plt.figure(figsize=(12, 6))
plt.plot(support_sizes, test_errors, marker='o')
plt.title('Test Error vs. Support Size')
plt.xlabel('Support Size')
plt.ylabel('Test Error (MSE)')
plt.yscale('log') # Log scale for better visualization
plt.grid(True)
plt.show()
# Find the minimum error and its corresponding support size
min_error = min(test_errors)
optimal_support_size = support_sizes[test_errors.index(min_error)]
print(f"Optimal support size: {optimal_support_size}")
print(f"Minimum test error: {min_error:.2e}")
Optimal support size: 10
Minimum test error: 8.15e-01
Based on the results, the optimal support size is indeed 10, with a minimum test error of approximately 8.15e-01. This indicates that as we
added more features, the test error continued to decrease, reaching its lowest value when all 10 selected features were used.
However, the key observation here is that while the error decreases steadily as more features are added, the improvement becomes less
pronounced after a certain number of features, indicating diminishing returns. Even though the optimal support size is 10 in this case, the
earlier features (around 5) seem to have the most significant impact on reducing the error, and additional features improve the model more
gradually.
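The MSE curve above is computed on the training split during forward selection, so it can only decrease as features are added. A quick sanity check (a sketch, reusing selected_features, X_train, y_train, X_test, and y_test from the cells above) is to refit a model on each prefix of the selected features and score it on the held-out split; the held-out error should stop improving once the true support is covered.

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Held-out MSE for each prefix of the features chosen by forward selection
holdout_errors = []
for k_feats in range(1, len(selected_features) + 1):
    feats = selected_features[:k_feats]
    model = LinearRegression().fit(X_train[:, feats], y_train)
    holdout_errors.append(mean_squared_error(y_test, model.predict(X_test[:, feats])))

for k_feats, err in enumerate(holdout_errors, start=1):
    print(f"support size {k_feats}: held-out MSE = {err:.4f}")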
Step 3: Lasso Regression with Manual Cross-Validation
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
# Normalize the feature matrix
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Perform cross-validation again with a wider range of alphas
alphas = np.logspace(-4, 1, 50)
param_grid = {'alpha': alphas}
grid_search = GridSearchCV(Lasso(max_iter=10000), param_grid, scoring='neg_mean_squared_error', cv=5, n_jobs=-1)
grid_search.fit(X_train_scaled, y_train)
# Find the best alpha and evaluate the test MSE
best_alpha = grid_search.best_params_['alpha']
lasso_best = Lasso(alpha=best_alpha, max_iter=10000)
lasso_best.fit(X_train_scaled, y_train)
y_pred_test = lasso_best.predict(X_test_scaled)
# Test MSE
test_mse = mean_squared_error(y_test, y_pred_test)
print(f"Best alpha (5 fold): {best_alpha}")
print(f"Test MSE with scaled features: {test_mse:.4f}")
Best alpha (5 fold): 0.005428675439323859
Test MSE with scaled features: 0.0012
Step 4: (Optional) Vary the Number of Folds in Cross-Validation
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
# Set up alpha ranges for Lasso
lasso_alphas = {'alpha': np.logspace(-4, 1, 50)}
# Define different number of folds for cross-validation
folds = [3, 5, 10]
# Store results
results = {}
for n_folds in folds:
    # Define k-fold cross-validation
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=42)
    # Lasso Regression with GridSearchCV
    lasso = Lasso(max_iter=10000)
    lasso_cv = GridSearchCV(lasso, lasso_alphas, cv=kf, scoring='neg_mean_squared_error')
    lasso_cv.fit(X_train_scaled, y_train)
    # Get best model and error
    best_alpha = lasso_cv.best_params_['alpha']
    test_pred = lasso_cv.predict(X_test_scaled)
    test_mse = mean_squared_error(y_test, test_pred)
    results[n_folds] = {'best_alpha': best_alpha, 'test_mse': test_mse}

# Print results
for n_folds, res in results.items():
    print(f"Number of Folds: {n_folds}, Best Alpha: {res['best_alpha']}, Test MSE: {res['test_mse']:.4f}")
Number of Folds: 3, Best Alpha: 0.002682695795279727, Test MSE: 67.0029
Number of Folds: 5, Best Alpha: 0.05689866029018299, Test MSE: 0.0543
Number of Folds: 10, Best Alpha: 0.008685113737513529, Test MSE: 0.0018
Observations on the variation in optimal alpha:

3 folds: the best alpha found is 0.00268 with a very high test MSE of 67.00. With only 25 training samples, 3-fold CV fits each model on roughly
16-17 samples (far fewer than the 200 features), so the validation estimates are extremely noisy and the selected alpha generalizes poorly.

5 folds: the best alpha increases to 0.05690 and the test MSE drops to 0.0543. Each fold now trains on 20 samples, giving more stable estimates
and a better-regularized model.

10 folds: the optimal alpha is 0.00869 and the test MSE drops further to 0.0018, showing excellent generalization. With 22-23 training samples
per fold, each fit is close to a fit on the full training set, so the selected alpha transfers well to the final model.

Overall, with such a small training set the optimal alpha is quite sensitive to the number of folds; more folds make each training fold closer in
size to the full data and tend to give more reliable hyperparameter estimates.
Step 5: (Optional) Compare with LassoCV
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error, r2_score
# Create LassoCV object
lasso_cv = LassoCV(alphas=np.logspace(-4, 1, 50), cv=5, random_state=42)
# Fit the model
lasso_cv.fit(X_train_scaled, y_train)
# Make predictions
y_pred_cv = lasso_cv.predict(X_test_scaled)
# Calculate MSE and R2 score
mse_cv = mean_squared_error(y_test, y_pred_cv)
r2_cv = r2_score(y_test, y_pred_cv)
print(f"Best alpha: {lasso_cv.alpha_}")
print(f"MSE: {mse_cv}")
print(f"R2 Score: {r2_cv}")
Best alpha: 1.9306977288832496
MSE: 60.114048729555506
R2 Score: 0.8491435237962413
Given the results from the two approaches:

GridSearchCV: best alpha of 0.0054 with a test MSE of 0.0012. LassoCV: best alpha of 1.9307 with a test MSE of 60.1140.

Brief Explanation of Discrepancy

The large gap between the selected alphas and the resulting MSEs means the two methods converge on very different hyperparameters for the
same model. Potential reasons for this discrepancy:

Regularization sensitivity: Lasso is sensitive to the choice of alpha, which controls the strength of the penalty. With only 25 training samples and
200 features, the cross-validation error curve is noisy and fairly flat over a wide range of alphas, so small differences in how the folds are
evaluated can move the selected alpha by orders of magnitude.

Solver settings: the GridSearchCV run used Lasso(max_iter=10000), while the LassoCV call above uses the default iteration budget. If the
coordinate-descent solver does not fully converge at the smallest alphas, their validation error is inflated, which can push LassoCV toward a
much larger (over-regularized) alpha.

Variance in cross-validation: even though both methods used 5-fold cross-validation, the specific splits and how the per-fold errors are
aggregated interact with the tiny sample size, leading to unstable performance estimates.

In short, the disagreement says less about the two APIs than about the selection problem itself: 25 samples is too few to estimate the
hyperparameter reliably with 5-fold cross-validation, so the grid search and LassoCV can land on very different alphas.
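To make the comparison concrete, one can plot the mean cross-validation MSE as a function of alpha for both searches side by side (a sketch, assuming grid_search from Step 3 and lasso_cv from the cell above are still in memory; GridSearchCV exposes its curve through cv_results_ and LassoCV through mse_path_):

import numpy as np
import matplotlib.pyplot as plt

# Mean CV MSE per alpha from GridSearchCV (its scores are negative MSE)
grid_alphas = np.array(grid_search.cv_results_['param_alpha'], dtype=float)
grid_mse = -grid_search.cv_results_['mean_test_score']

# Mean CV MSE per alpha from LassoCV (mse_path_ has shape n_alphas x n_folds)
lassocv_alphas = lasso_cv.alphas_
lassocv_mse = lasso_cv.mse_path_.mean(axis=1)

plt.figure(figsize=(10, 6))
plt.plot(grid_alphas, grid_mse, marker='o', label='GridSearchCV')
plt.plot(lassocv_alphas, lassocv_mse, marker='s', label='LassoCV')
plt.xscale('log')
plt.yscale('log')
plt.xlabel('alpha')
plt.ylabel('mean CV MSE')
plt.title('Cross-validation error curves for the two searches')
plt.legend()
plt.show()

If the two curves essentially overlap, the different minima come from noise in a flat region of the curve; if they diverge at small alphas, a convergence or evaluation difference is the more likely culprit.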