Esami - R UNIPD

The document outlines exam questions for a Machine Learning for Bioengineering course, covering various datasets related to medical conditions such as breast cancer, liver disease, cardiac arrhythmia, and more. Each section includes tasks like data analysis, model evaluation, and performance comparison using techniques like k-NN, PCA, classification trees, and random forests. Students are required to justify their answers, provide code and figures, and submit their work via email.

Uploaded by

mazzapicaalessandro

MACHINE LEARNING FOR BIOENGINEERING

Exam, February 10, 2023: PART 2 (120 min)

In moodle you will find a dataset regarding breast cancer and denoted “cancer.dat”. The dataset consists of
geometrical characteristics of cell nuclei obtained from digitized images of breast mass of benign (B) and malignant
(M) tumors, as given in the response variable class. There are no missing data. The overall aim is to classify
tissue samples into one of the two classes based on their geometrical features.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures. Send code and figures by email to [email protected].

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (13 pts)

- What is the accuracy of the “chance” (also known as “naive” or “majority”) predictor for class?
- Split the data into training and test sets (use set.seed in your code for reproducibility).
- Perform a k-Nearest Neighbor (k-NN) analysis for k = 3 to predict class in the test set from the training
set. Compare your result to the “chance” predictor.
- Tune the k-NN classifier with respect to the test error. Comment on your result.
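The steps of question 1 can be sketched roughly as follows. This is a Python illustration on made-up data, not the expected R solution: the helper names (majority_accuracy, train_test_split, knn_predict, knn_accuracy) are hypothetical, the 70/30 split is an assumption, and the fixed seed plays the role of set.seed.

```python
import random
from collections import Counter

def majority_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def train_test_split(rows, test_fraction=0.3, seed=1):
    """Shuffle a copy with a fixed seed, then split into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def knn_accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

# Made-up two-feature data: 60 "B" and 40 "M" observations (assumption).
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "B") for _ in range(60)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), "M") for _ in range(40)]

train, test = train_test_split(data, test_fraction=0.3, seed=1)
print("majority baseline:", majority_accuracy([y for _, y in data]))
for k in (1, 3, 5, 7):  # odd k avoids tied votes with two classes
    print("k =", k, "test accuracy =", knn_accuracy(train, test, k))
```

On real data the features would normally be standardized before computing distances, and selecting k on the test set makes the reported test error optimistic.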

2. (10 pts)

- Perform a Principal Components Analysis (PCA) on the numerical part of the training data, i.e., assuming
that the true class is unknown. How many Principal Components (PCs) would you use?
- Which of the original features mainly determine the first PC?

3. (12 pts)

- Perform a classification tree analysis of the dataset.
- Which are the most important features according to this analysis? Compare to the results from PCA.
- Does the performance improve if you use the first few principal components for the tree analysis?

4. (15 pts)

- Perform hierarchical cluster analysis with “complete-link clustering”, assuming that the true class is
unknown, and visualize the result.
- Inspecting the resulting dendrogram, how many clusters would you choose? Why?
- Find the suggested number of clusters according to the silhouette method (do not use the function
fviz_nbclust but write your own explicit code with a for loop).
- Do the clusters obtained in the two previous questions correspond to the class variable? Comment on
the results.
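A minimal sketch of the silhouette method written with an explicit loop over the number of clusters, in Python rather than the R expected in the exam. It assumes Euclidean distance and a naive complete-linkage implementation; the function names (dist, complete_linkage, mean_silhouette) and the nine toy points are made up for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(points, k):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    farthest pair of members is closest, until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels

def mean_silhouette(points, labels):
    """Average of s(i) = (b - a) / max(a, b); singletons contribute 0."""
    n = len(points)
    total = 0.0
    for i, p in enumerate(points):
        own = [dist(p, points[j]) for j in range(n)
               if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(own) / len(own)  # mean distance within the own cluster
        others = set(labels) - {labels[i]}
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, points[j]) for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in others
        )
        total += (b - a) / max(a, b)
    return total / n

# Toy points forming three tight groups (made up for illustration).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

scores = {}
for k in range(2, 6):
    scores[k] = mean_silhouette(points, complete_linkage(points, k))
    print("k =", k, "mean silhouette =", round(scores[k], 3))
best_k = max(scores, key=scores.get)
print("suggested number of clusters:", best_k)
```

The cubic-time merging loop is only suitable for small examples; it is meant to make the definition visible, not to be efficient.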

5. Bonus question – do only in case you have finished all the other ones!
Perform a support vector machine analysis predicting class and evaluate the final model on the test set.

MACHINE LEARNING FOR BIOENGINEERING
Exam, June 15, 2022: PART 2

In moodle you will find a dataset regarding the liver disease hepatitis and denoted “hepatitis.dat”. The dataset
consists of 155 patients with a response variable denoted “class” that indicates whether the individual died of the
disease, and 19 features. The aim is to predict “class” from the other features. There are missing data.

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)
What is the accuracy of the naive (“chance” or “majority”) predictor?

2. (10 pts)
Perform a k-medoids analysis with two clusters using the 19 features on the complete data, i.e., after omitting
patients with NAs. Do the clusters correspond to the “class” variable? Comment on the result.

3. (10 pts)

- Perform a tree analysis on the complete data, i.e., after omitting patients with NAs.
- Evaluate the (training) accuracy of the obtained tree on the same dataset using a confusion matrix.
- Based on your tree analysis, do the features with many missing datapoints seem to be important for
classification?

4. (10 pts)

- Split the entire dataset (with NAs) into training and test sets (use set.seed in your code for
reproducibility).
- Perform a tree analysis using surrogate splits on the training set.
- Choose a patient in your test set with NA in the variable of the primary split. Explain in detail how this
individual is classified by the tree.
- Evaluate the final tree on the test set and comment on the results.

5. (15 pts)

- Perform a gradient boosting analysis (with trees) on the training and test sets generated for the previous
question (gbm uses surrogate splits).
- Evaluate the classifier with a ROC curve and comment on the result.
- Which threshold would you choose if you aim to predict correctly at least 75% of the cases that die?
- Which threshold would you choose to maximize the prediction accuracy?
- Which are the two most important features according to this analysis?
- Describe qualitatively how these two most important features influence the probability of dying according
to your analysis.
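The two threshold questions above can be illustrated with a small Python sketch (the exam itself expects R). It assumes that a case is predicted positive when its predicted probability is at least the threshold, and that label 1 marks the positive class; the function names and the ten labels and scores are made up.

```python
def sensitivity(y_true, scores, threshold):
    """Fraction of positive cases (label 1) with score >= threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def accuracy(y_true, scores, threshold):
    """Fraction of cases classified correctly at this threshold."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(y_true, scores))
    return correct / len(y_true)

def threshold_for_sensitivity(y_true, scores, target):
    """Largest observed score that, used as threshold, reaches the target
    sensitivity; higher thresholds keep false positives lower."""
    for t in sorted(set(scores), reverse=True):
        if sensitivity(y_true, scores, t) >= target:
            return t

def threshold_for_accuracy(y_true, scores):
    """Observed score that maximizes accuracy when used as threshold."""
    return max(sorted(set(scores)), key=lambda t: accuracy(y_true, scores, t))

# Made-up labels and predicted probabilities (assumption, not exam data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.5, 0.4, 0.2, 0.2, 0.1, 0.05]

t_sens = threshold_for_sensitivity(y_true, scores, target=0.75)
t_acc = threshold_for_accuracy(y_true, scores)
print("threshold for sensitivity >= 0.75:", t_sens)
print("threshold maximizing accuracy:", t_acc)
```

Only observed scores are tried as thresholds, since the confusion matrix can change only at those values. The two criteria need not give the same threshold on real data.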

6. Bonus question – do only in case you have finished all the other ones!
Does the prediction based on gradient boosting improve if you use multiple imputation instead of surrogate
splits?

MACHINE LEARNING FOR BIOENGINEERING
Exam, June 30, 2022: PART 2

In moodle you will find a dataset regarding cardiac arrhythmia and denoted “arrhythmia.dat”. The dataset
consists of 452 patients with a response variable denoted “class” that indicates whether the individual had
arrhythmia according to the doctors, and 199 features (4 basic features: age, sex, height, weight; the feature
heart.rate; the remaining 194 features describe various features obtained from an ECG). There are missing data.

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (10 pts)

- What is the accuracy of the “chance” (also known as “naive” or “majority”) predictor for class?
- Evaluate Naive Bayes prediction using a training set and a test set. Compare to chance classification and
comment on the results.

2. (10 pts)

- Perform a linear regression analysis with heart.rate as a function of the 4 basic features (see above).
Find the optimal model according to AIC and BIC.
- Repeat the analysis of the previous point considering only adults (age 18 or older).
- Comment on your results, possibly using plots of heart.rate versus the basic features to help your
reasoning.

3. (15 pts)

- Perform a principal component analysis (PCA) using the numerical features that do not have missing
data. What would be a “good” number of principal components to consider?
- Perform a k-nearest neighbor analysis using all the numerical features without NAs to predict the class.
Evaluate with LOOCV.
- Perform a k-nearest neighbor analysis using the first few principal components to predict the class.
Evaluate with LOOCV. Comment on your results.
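Leave-one-out cross-validation (LOOCV) for k-NN can be sketched as below. This is a Python illustration under assumptions (Euclidean distance, majority vote, hypothetical function names, eight made-up points), not the R code the exam asks for.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loocv_accuracy(data, k):
    """Hold out each observation once and predict it from all the others."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)

# Two made-up groups of four points each.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 0),
        ((5, 5), 1), ((5, 6), 1), ((6, 5), 1), ((6, 6), 1)]

for k in (1, 3, 5, 7):
    print("k =", k, "LOOCV accuracy =", loocv_accuracy(data, k))
```

With k = 7 every held-out point is outvoted by the four points of the other group, which shows how an overly large k can degrade the classifier. The same loop applies unchanged when each feature vector is replaced by its first few principal component scores.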

4. (15 pts)

- Split the dataset (excluding features with NAs) into training and test sets (use set.seed in your code for
reproducibility).
- Perform a support vector machine analysis predicting class and evaluate the final model on the test set.
Comment on the results regarding accuracy, sensitivity and specificity.
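Accuracy, sensitivity and specificity all follow from the four cells of the confusion matrix. A short Python sketch, assuming a binary response with label 1 as the positive class; the function names and the eight predictions are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def summarize(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up true labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = summarize(y_true, y_pred)
print(metrics)
```

When the classes are unbalanced, a high accuracy can coexist with a poor sensitivity, which is why the three quantities are worth reporting together.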

5. Bonus question – do only in case you have finished all the other ones!
Does the prediction based on SVM improve if you use multiple imputation to handle the features with
missing data (rather than excluding them)?

MACHINE LEARNING FOR BIOENGINEERING
Exam, August 30, 2023: PART 2

In moodle you will find a dataset regarding breast cancer, denoted “nki5y.dat”.

The dataset consists of 280 cases with a response variable denoted “death” that indicates if the patients survived
beyond 5 years after treatment was initiated (0: survived >5 years; 1: died before 5 years) and 12 features.
“patnr” is a patient ID number.

The aim is to analyze how the 5-year mortality (“death”) depends on the other features, and vice versa, to use
these features to predict “death”. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on the
computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)
What is the accuracy of the naive (“chance” or “majority”) predictor? What does it predict?

2. (20 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough logistic regression analysis using the training data.
- Evaluate the test accuracy of the final logistic model on the test set using a confusion matrix. Comment
on the results.
- Evaluate the classifier with a ROC curve and comment on the result. Which threshold would you choose
in order to classify at least 80% of positive cases correctly?

3. (10 pts)

- Perform a thorough tree classification analysis using the training data generated for the previous question.
- Comment on the correspondence between important variables as found by the tree analysis and the relevant
variables found from logistic modelling.
- Evaluate the test accuracy of the obtained tree on the test set using a confusion matrix. Comment on the
results.

4. (20 pts)

- Perform a gradient boosting analysis (with trees) on the training set (generated for question 2), and
evaluate the accuracy on the test set using a confusion matrix.
- Evaluate the classifier with a ROC curve and comment on the result.
- Which are the two most important features according to this analysis?
- Describe qualitatively how these two most important features influence the 5-year mortality according to
your analysis, and compare to the results found in questions 2 and 3.

MACHINE LEARNING FOR BIOENGINEERING
Exam, September 19, 2022: PART 2 (120 min)

In moodle you will find a dataset regarding raisins and denoted “raisin.dat”. The dataset consists of geometrical
characteristics of 900 raisins of two types, “Besni” and “Kecimen”, given in the response variable “Class”. There
are no missing data. The overall aim is to classify raisins into one of the two classes based on their geometrical
features.

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (10 pts)

- Perform a k-means analysis with two clusters, assuming that the true type of the raisins is unknown, and
visualize the result.
- Do the clusters correspond to the “Class” variable? Comment on the result.
- Find the optimal number of clusters.

2. (10 pts)

- Split the data into training and test sets (use set.seed in your code for reproducibility).
- Perform a logistic regression analysis on the training data, finding the optimal model according to AIC
and BIC.
- Evaluate predictions of the optimal model on the test set, and comment on your results.
- Show a ROC curve and calculate the AUC for the optimal model and comment on the results.

3. (15 pts)

- Perform a Principal Components Analysis (PCA) on the numerical part of the training data, i.e., assuming
that the true type of the raisins is unknown. How many Principal Components (PCs) would you use?
- Which of the original features mainly determine the first PC?
- Train a logistic regression model on the training data to predict the Class from the first two PCs. Evaluate
predictions of the model on the test set, and comment on your results.

4. (15 pts)

- Perform a Random Forest (RF) analysis of the dataset.
- Which are the most important parameters according to this analysis? Compare to the results from logistic
regression and PCA.
- Does the performance improve if you use the first few principal components for the RF analysis?

In moodle you will find a dataset regarding fetal health, denoted “fetalhealth.dat”. The dataset consists of
2126 cases with a response variable denoted “fetal health” that indicates the health status of the foetus (1: Normal,
2: Suspect, 3: Pathological), and 21 features mainly related to cardiotocograms (CTGs). The aim is to
predict “fetal health” from the other features. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)
What is the accuracy of the naive (“chance” or “majority”) predictor? What does it predict?

2. (10 pts)

- Perform a k-means analysis with three clusters using the 21 features. Do the clusters correspond to the
“fetal health” variable? Comment on the result.
- Find the optimal number of clusters for k-means using silhouettes. Perform the corresponding clustering
and comment on the results.

3. (10 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough tree analysis using the training data.
- Evaluate the test accuracy of the obtained tree on the test dataset using a confusion matrix. Comment
on the results.

4. (25 pts)

- Perform a thorough random forest analysis using the training data and evaluate the test accuracy using a
confusion matrix. Comment on the results.
- Perform a thorough random forest analysis aiming to predict “Suspect or Pathological” vs. “Normal”,
using the training data (i.e., unite classes 2 and 3).
- Which are the two most important features according to this analysis?
- Describe qualitatively how these two most important features influence the probability of the case being
“Suspect or Pathological”.
- Evaluate the classifier with a ROC curve and comment on the result.
- Which threshold would you choose if you aim to predict correctly at least 95% of the cases that are
“Suspect or Pathological”?

MACHINE LEARNING FOR BIOENGINEERING
Exam, July 12, 2023: PART 2

In moodle you will find a dataset regarding migraine, denoted “migraine.dat”. The dataset consists of 400
cases with a response variable denoted “Type” that indicates the type of migraine that each of the subjects suffers
from. There are 7 different types of migraine in the data set. The aim is to predict “Type” from the other 18
features. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)

- What is the accuracy of the naive (“chance” or “majority”) predictor? What does it predict?

2. (10 pts)

- Perform a hierarchical clustering analysis using the 18 features.
- Do the clusters that you have found correspond to the “Type” variable? Comment on the result.
- Perform a principal components analysis (PCA). Does this help you to see clusters? Compare to the
results for hierarchical clustering and comment.

3. (10 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough k nearest neighbor (k-NN) analysis using the training data.
- Evaluate the test accuracy of the obtained model on the test dataset using a confusion matrix. Comment
on the results.

4. (25 pts)

(a) Perform a thorough support vector machine (SVM) analysis using the training data and evaluate the
test accuracy using a confusion matrix. Comment on the results.
(b) Perform a thorough SVM analysis aiming to predict “Type = 6” vs. “Type ≠ 6”, using the training
data (i.e., unite classes 1-5 and 7). Compare to the SVM analysis performed in the previous question.
(c) Investigate (e.g., graphically) and describe qualitatively how the features Visual and Tinnitus are
involved in predicting whether a subject has a Type 6 migraine or not, according to this analysis. It may
be useful to set all other variables to their average value. Do you see any evidence for an interaction
between these two features (Visual and Tinnitus) in the SVM model? Investigate also how (some of)
the other variables influence the predictions of the model.
(d) Repeat the previous question for the analysis done for question (a).

MACHINE LEARNING FOR BIOENGINEERING
Exam, January 31, 2024: PART 2

In moodle you will find a dataset regarding cardiovascular disease, denoted “cvd.dat”. The dataset consists of
1316 cases with a response variable denoted “class” that indicates whether the patient had a heart attack (0: no,
1: yes) and 8 features. The aim is to predict “class” from the other features. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)
What is the accuracy of the naive (“chance” or “majority”) predictor? What does it predict?

2. (10 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough k-nearest neighbor analysis using the training data.
- Evaluate the test accuracy of the obtained model on the test set using a confusion matrix. Comment on
the results.

3. (15 pts)

- Perform a thorough logistic regression analysis using the training data.
- Describe qualitatively how the most important features influence the risk of having a heart attack
(“if feature X increases, then the risk increases/decreases”).
- Evaluate the test accuracy of the final logistic model on the test set using a confusion matrix. Comment
on the results.
- Evaluate the classifier with a ROC curve and comment on the result.

4. (13 pts)

- Perform a thorough tree analysis using the training data.
- Describe qualitatively how the most important features influence the risk of having a heart attack. Does
it correspond to the logistic regression analysis?
- Evaluate the test accuracy of the obtained tree on the test dataset using a confusion matrix. Comment
on the results.

5. (12 pts)

- Which of the different classifiers studied above would you recommend? Why?
- Plot the two most important features against each other using different colors according to the
value of “class”. Does the plot correspond to your analyses? Why/why not?
- Based on the plot, do you think a linear support vector machine would be a good classifier for this problem?
Why/why not?

MACHINE LEARNING FOR BIOENGINEERING
Exam, June 19, 2024: PART 2

In moodle you will find a dataset regarding thyroid disease, denoted “thyroid.dat”. The dataset consists of
383 cases with a response variable denoted “Recurred” that indicates whether the disease recurred during a
certain period, and 15 features. The aim is to predict “Recurred” from the other features. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)

- Which class does the majority (also known as “chance” or “naive”) predictor predict for this dataset?
How accurate is it?

2. (20 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough random forest analysis using the training data and evaluate the test accuracy using a
confusion matrix. Comment on the results.
- Which are the most important features according to this analysis?
- Plot the predicted probabilities against each of the two most important features for the subjects in the training
set. Comment on the plots to describe qualitatively how these two features influence the probability of
recurrence of the disease according to the model.
- Evaluate the classifier with a ROC curve and comment on the result.

3. (15 pts)

- Perform a thorough GBM analysis using the training data and evaluate the test accuracy using a confusion
matrix. Comment on the results.
- Evaluate the classifier with a ROC curve and comment on the result.
- Which are the most important features according to this analysis? Compare to your Random Forest
analysis.
- Make plots that show how the most important features influence your GBM model, and discuss whether
the overall conclusions concerning the qualitative effects of these features agree with the Random Forest
analysis.

4. (15 pts)

- Construct a logistic regression model using the training data and evaluate the test accuracy using a
confusion matrix. Comment on the results.
- Evaluate the logistic regression model with a ROC curve and comment on the result.
- Compare your logistic regression model to your conclusions concerning the role of the most important
features in the Random Forest and GBM classifiers.

MACHINE LEARNING FOR BIOENGINEERING
Exam, July 17, 2024: PART 2

In moodle you will find a dataset regarding breast cancer and denoted “breastcancer.dat”. The dataset consists
of geometrical features obtained from images of benign (B) and malignant (M) breast tumors, as given in the
response variable class. An identification number of each instance is also given in the column ID. There are no
missing data. The overall aim is to classify the tumors into one of the two classes based on their geometrical features.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)

- Which class does the majority (also known as “chance” or “naive”) predictor predict for this dataset?
How accurate is it?

2. (20 pts)

- Perform a Principal Components Analysis (PCA), assuming that the true type of the class is unknown.
How many Principal Components (PCs) would you use?
- Investigate and illustrate (also graphically) how the first two PCs depend on the original features, and
comment on your results.
- Perform a k-means analysis with two clusters, assuming that the true class is unknown, and visualize the
result in the plane spanned by the first two PCs.
- Find the suggested number of clusters according to the WSS and the silhouette methods, and perform
k-means analysis accordingly.
- Do the clusters obtained in the previous questions correspond to the class variable? Comment on the
results.

3. (15 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough support vector machine (SVM) analysis using the training data and evaluate the test
accuracy using a confusion matrix. Comment on the results.
- Illustrate your final model by creating a figure showing the prediction boundary in the plane spanned
by the features concave points and area, and with the other features set to their average values
(calculated over the training set). Comment on the figure.
- Repeat the previous point but now for the features fractal dimension and area.

4. (15 pts)

- Perform a thorough naive Bayes (NB) classification analysis using the training data and evaluate the test
accuracy using a confusion matrix. Comment on the results.
- Evaluate the classifier with a ROC curve and comment on the result.
- Discuss how the features concave points, area and fractal dimension distinguish between the two
classes according to the NB model. Compare to your PCA and SVM analyses.

MACHINE LEARNING FOR BIOENGINEERING
Exam, September 17, 2024: PART 2

In moodle you will find a dataset regarding ECG data and denoted “ecg.dat”. The dataset consists of 442
patients with a response variable denoted “class” that indicates whether the individual had arrhythmia according
to the doctors, and 197 features (5 “named” features: age, sex, height, weight, heart.rate; the remaining
192 features describe various features obtained from an ECG). There are no missing data.

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)

- Which class does the majority (also known as “chance” or “naive”) classifier predict for this dataset? How
accurate is it?

2. (12 pts)

- Perform hierarchical clustering, excluding class and sex from the analysis. Aim at getting two clusters of
comparable size.
- Evaluate the goodness of the clustering with silhouettes and comment on the result.
- Do the clusters obtained correspond to the class variable? Comment on the result.

3. (12 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough classification tree analysis predicting class from the other features using the training
data. Evaluate the obtained tree model on the test data with a confusion matrix. Comment on the
results, in particular if any “named” features (see above) appear in your tree.
- Comment on your results, possibly using plots of versus the basic features to help your reasoning.

4. (8 pts)

- Perform a thorough k-nearest neighbor analysis, excluding the feature sex, to predict the class. Evaluate
the classifier with LOOCV.

5. (18 pts)

- Perform a thorough random forest analysis predicting class from the other features using the training
data and evaluate on the test data with a confusion matrix. Comment on the results.
- Which are the most important features according to this analysis?
- To understand how heart.rate influences the model, predict the probability of arrhythmia using your
final random forest model on the training set, and plot these predicted probabilities against heart.rate
for the subjects in the training data. Comment on the result.
- Evaluate the classifier with a ROC curve and comment on the result.

MACHINE LEARNING FOR BIOENGINEERING
Exam, January 30, 2025: PART 2

In moodle you will find a dataset regarding heart disease, denoted “heart.dat”. The dataset consists of 1316
cases with a response variable denoted “cl” that indicates whether the patient had a heart attack and 8 features.
The aim is to predict “cl” from the other features. There are no missing data.

Answers must be justified and explained. You can write your answers on sheets of paper, or in a document on
the computer to be sent with code and figures.

>>> Send code and figures by email to [email protected]. <<<

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

1. (5 pts)

- Which class does the majority (also known as “chance” or “naive”) predictor predict for this dataset?
How accurate is it?

2. (15 pts)

- Split the dataset into training and test sets (use set.seed in your code for reproducibility).
- Perform a thorough GBM analysis using the training data and evaluate the test accuracy using a confusion
matrix. Comment on the results.
- Evaluate the classifier with a ROC curve and comment on the result.
- Which are the most important features according to this analysis?
- Make plots that show how the two most important features influence your GBM model, both separately
and in combination, and comment on the figures.

3. (12 pts)

- Perform a thorough naive Bayes (NB) classification analysis using the training data and evaluate the test
accuracy using a confusion matrix. Comment on the results.
- Evaluate the classifier with a ROC curve and comment on the result.
- Discuss which of the features distinguish best between the two classes according to the NB model, e.g.,
using appropriate plots. Compare to your GBM analyses.

4. (18 pts)

- Perform a thorough support vector machine (SVM) analysis using the training data and evaluate the test
accuracy using a confusion matrix. Comment on the results.
- Illustrate your final model by creating a figure showing the prediction boundary in the plane spanned by
the features logtrop and logkcm, and with the other features set to their average values (calculated
over the training set).
- Repeat the previous point but now for the best linear SVM.
- Comment on these two SVM figures and compare to the results obtained for the GBM and NB analyses.

No ratings yet
AI Project Cycle MCQs for Class 10
17 pages
1 s2.0 S1359644621005043 Main
No ratings yet
1 s2.0 S1359644621005043 Main
18 pages
B.Tech AI & DS Syllabus 2021-22
No ratings yet
B.Tech AI & DS Syllabus 2021-22
203 pages
ASTER Accurately Estimating The Number of Cell Typ
No ratings yet
ASTER Accurately Estimating The Number of Cell Typ
3 pages
Presentation Data Mining
No ratings yet
Presentation Data Mining
22 pages
Unsupervised Learning Review
No ratings yet
Unsupervised Learning Review
12 pages
16 Important Data Science Papers
No ratings yet
16 Important Data Science Papers
248 pages
AI Tools
No ratings yet
AI Tools
16 pages
Data Science: Unsupervised Learning
No ratings yet
Data Science: Unsupervised Learning
49 pages
5 - 2024 - Artificial Intelligence, Internet of Things and 6G Methodologies in The Context of Vehicular Ad-Hoc Networks (VANETs)
No ratings yet
5 - 2024 - Artificial Intelligence, Internet of Things and 6G Methodologies in The Context of Vehicular Ad-Hoc Networks (VANETs)
22 pages
Evaluation of Solar Energy Generation and Radiation Prediction Using Machine Learning
No ratings yet
Evaluation of Solar Energy Generation and Radiation Prediction Using Machine Learning
5 pages
Session 2 Intro AI ML ITiE
No ratings yet
Session 2 Intro AI ML ITiE
23 pages
Unit 1 ML
No ratings yet
Unit 1 ML
17 pages
Aids - 21ad62 - Datascience Lab Manual-1
No ratings yet
Aids - 21ad62 - Datascience Lab Manual-1
15 pages

Esami - R UNIPD

Uploaded by

Esami - R UNIPD

Uploaded by

MACHINE LEARNING FOR BIOENGINEERING

Exam, February 10, 2023: PART 2 (120 min)

WRITE YOUR NAME ON ALL SHEETS OF PAPER!

- Perform a classification tree analysis of the dataset.

In Moodle you will find a dataset regarding breast cancer, denoted “nki5y.dat”.

>>> Send code and figures by email to [email protected]. <<<

- Perform a Random Forest (RF) analysis of the dataset.

- Perform a hierarchical clustering analysis using the 18 features.

- Perform a thorough logistic regression analysis using the training data.

- Perform a thorough tree analysis using the training data.
