Data Modeling Project Overview

The Data Modeling project for the DS3122 course involves a group of 4-5 students and accounts for 25% of the overall assessment. Students will gain hands-on experience in building deep learning models for image classification, focusing on high-variability datasets, and will be required to submit a report, a poster, and deliver a final presentation. The project includes tasks such as dataset selection, data preprocessing, model building, evaluation, and reporting, with specific guidelines and examples provided for successful completion.

Uploaded by

mohamed sayed

DS3122: Data Modeling

Year: 1446/1447, Semester-I

Department of Data Science
College of Computing

Data Modeling Project

General Notes
• This course project accounts for 25% of the overall course assessment.
• This project should be carried out by a group of 4-5 students.
• This project must be submitted via Blackboard.

Learning Outcomes:

The aim of this project is to provide students with hands-on experience in building deep
learning models for image classification problems, particularly those involving datasets with
high variability. By the end of the project, students will:

1. Gain practical experience in training and evaluating deep learning models using
high-variability image datasets.
2. Deepen knowledge of deep learning techniques: specifically CNNs, RNNs (for
sequential image-related tasks), and feedforward neural networks. Students will also
compare these methods with traditional machine learning approaches.
3. Tackle complex classification challenges involving diverse, real-world data,
focusing on extracting patterns and improving model accuracy.

Requirements:

1. Dataset Selection:
o Choose a dataset that involves image classification with high variability.
Datasets with diverse image categories or complex real-world scenes are
recommended. Example datasets can be found on platforms such as:
§ Google Dataset Search
§ Kaggle
§ UCI ML Repository
§ COCO, ImageNet, CIFAR-100, etc.
2. Data Preprocessing:
o Ensure appropriate data preprocessing techniques are applied. This includes
handling missing or corrupted data, data augmentation (such as rotation,
flipping, or scaling of images), normalization, and other preprocessing steps
like resizing images to a uniform format.
3. Model Building:
o Apply at least two deep learning architectures (e.g., CNNs, RNNs, or
feedforward neural networks) for image classification.
o Compare the performance of deep learning techniques with classical machine
learning methods (e.g., decision trees, SVM) to demonstrate why deep
learning performs better on high-variability datasets.

4. Evaluation:
o Use appropriate evaluation metrics like accuracy, precision, recall, F1 score,
or confusion matrix. Where applicable, track performance through
visualizations (e.g., learning curves, confusion matrices).
o Justify the choice of evaluation metrics based on the project goals.
5. Report Structure: Your report should include the following sections:
o Abstract: A brief paragraph (500 words max) summarizing the project goals,
dataset used, approach, and key findings.
o Introduction: Introduce the image classification problem, dataset, and why
it's a challenging classification task due to variability. Describe the approach
and methodology (half a page to one page).
o Data Exploration and Description:
§ Provide a detailed description of the dataset, including the number of
classes, types of images, variability, and any preprocessing challenges
faced (such as noise, missing data, or imbalance). Include relevant
visualizations (e.g., sample images, class distribution plots).
o Methodology:
§ Describe the data preprocessing techniques, deep learning models used
(e.g., CNNs, RNNs), evaluation metrics, and experimental setup.
Include hyperparameters, training epochs, and data augmentation
techniques, if any.
§ Justify your choice of models and their suitability for high variability
datasets.
o Patterns Discovery and Results:
§ Present results from the deep learning models. Include performance
metrics, learning curves, and visualizations like accuracy and loss
graphs, confusion matrices, etc.
§ Compare the results of deep learning models against classical machine
learning models (e.g., Random Forest, SVM), explaining the benefits
and limitations of each method.
o Conclusion:
§ Summarize your findings, highlight the challenges of dealing with high
variability, and discuss potential improvements or future work, such as
fine-tuning models or applying transfer learning.
o References:
§ Follow the IEEE referencing style.
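
The evaluation metrics named in requirement 4 can be computed directly from a confusion matrix. The following is a minimal sketch in plain Python, not a prescribed implementation: the function names and the toy labels are assumptions made for illustration, and the macro-averaged F1 score is one common way (among several) to summarize per-class results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Return matrix[i][j] = number of samples with true label i predicted as label j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_metrics(matrix)
    return sum(f1 for _, _, f1 in scores) / len(scores)


# Made-up labels and predictions, for illustration only.
labels = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "cat", "dog", "dog", "dog", "cat"]

cm = confusion_matrix(y_true, y_pred, labels)
print(cm)
print("accuracy:", round(accuracy(cm), 3))
print("macro F1:", round(macro_f1(cm), 3))
```

In this toy example the "bird" class is never predicted correctly, so its F1 score is 0 and the macro F1 falls well below the accuracy. That gap is the kind of evidence that can support the justification of metric choice asked for in requirement 4.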

Tasks:

1. Task 1: Submit a poster for the exhibition next week, on 8 October 2024.
2. Task 2: Submit the report.
3. Task 3: Deliver the final presentation.

Sample Image Classification Projects

1. ImageNet

• Description: One of the largest and most popular image classification datasets, with
over 14 million labeled images across more than 20,000 categories. The widely used
ILSVRC benchmark subset covers 1,000 categories.
• Link: [Link]

2. CIFAR-10

• Description: Contains 60,000 32x32 color images in 10 classes (e.g., airplanes, birds,
cats, and dogs), with 6,000 images per class.
• Link: [Link]
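
For small color images like these, two of the preprocessing steps named in the requirements (normalization and flipping) can be sketched as below. This is a minimal illustration in plain Python on nested lists, assuming 8-bit RGB pixels; the function names and the tiny 2x2 stand-in image are hypothetical. A real project would more likely use an array library or the augmentation tools of its deep learning framework.

```python
def normalize(image):
    """Scale 8-bit pixel values (0-255) to floats in [0, 1]."""
    return [[[channel / 255.0 for channel in pixel] for pixel in row] for row in image]


def flip_horizontal(image):
    """Mirror the image left to right by reversing each row of pixels."""
    return [row[::-1] for row in image]


def augment_with_flips(images, labels):
    """Return the originals plus one flipped copy of each image.

    Flipping does not change the label, so each copy reuses its source label.
    """
    out_images = list(images)
    out_labels = list(labels)
    for image, label in zip(images, labels):
        out_images.append(flip_horizontal(image))
        out_labels.append(label)
    return out_images, out_labels


# A tiny 2x2 RGB "image" stands in for a real 32x32 one.
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (51, 102, 204)],
]

flipped = flip_horizontal(tiny)
scaled = normalize(tiny)
images, labels = augment_with_flips([tiny], ["airplane"])
print(len(images), labels)
```

Horizontal flips usually suit natural images such as animals or vehicles, but they are not label-preserving for every dataset (handwritten digits are a common counterexample), so the choice of augmentations should be justified per dataset.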

3. Emotion Recognition from Facial Images

• Objective: Build a CNN-based model to classify emotions (e.g., happy, sad, angry,
surprised) from images of faces.
• Dataset: [Link]

4. Garbage Classification for Recycling

• Objective: Classify images of waste (plastic, paper, metal, etc.) to aid in automated
recycling processes.
• Dataset: [Link]

5. MNIST (Modified National Institute of Standards and Technology)

• Description: A well-known dataset containing 70,000 grayscale images of
handwritten digits (0-9), with each image being 28x28 pixels.
• Link: [Link]
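
On a dataset of small grayscale images such as this one, a classical baseline typically works on flattened pixel vectors (a 28x28 image becomes 784 features). The sketch below uses a nearest-centroid classifier, chosen here only because it fits in a few lines of plain Python; it is not one of the classical methods named in the requirements (decision trees, SVM), and the toy 2x2 images and function names are assumptions for illustration.

```python
def flatten(image):
    """Turn a 2-D grayscale image (list of rows) into a 1-D feature vector."""
    return [pixel for row in image for pixel in row]


def fit_centroids(vectors, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, vec))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Made-up 2x2 "images": class 0 is bright on the left, class 1 on the right.
train_images = [
    [[9, 0], [8, 1]],
    [[8, 1], [9, 0]],
    [[0, 9], [1, 8]],
    [[1, 8], [0, 9]],
]
train_labels = [0, 0, 1, 1]

centroids = fit_centroids([flatten(img) for img in train_images], train_labels)
prediction = predict(centroids, flatten([[7, 2], [9, 1]]))
print(prediction)
```

Because flattening discards the spatial layout of the pixels, a baseline like this gives a useful reference point when comparing against a CNN, which operates on that layout directly.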

6. Fashion MNIST

• Description: A dataset similar to MNIST but with 70,000 grayscale images of
fashion items (e.g., shoes, shirts, bags) instead of digits, aimed at more complex
classification tasks.
• Link: Fashion MNIST

7. Animal Species Identification

• Objective: Classify different species of animals (e.g., lions, tigers, elephants, birds)
using image datasets.
• Dataset: iNaturalist or Caltech-UCSD Birds 200

8. Flowers Recognition

• Description: A dataset containing images of flowers from 5 different species. It’s
commonly used for fine-grained image classification tasks.
• Link: Flowers Dataset

9. Plant Disease Detection

• Objective: Classify plant diseases based on leaf images to assist farmers in
diagnosing diseases.
• Dataset: [Link]

10. Stanford Cars

• Description: Contains 16,185 images of 196 classes of cars. This dataset is often used
for fine-grained classification of car models.
• Link: Stanford Cars Dataset

11. Food-101

• Description: Contains 101,000 images of food items from 101 categories. Each class
contains 1,000 images, and the dataset is designed for fine-grained food classification
tasks.
• Link: [Link]

12. CelebA (CelebFaces Attributes)

• Description: A large-scale facial attributes dataset with more than 200,000 celebrity
images, each annotated with 40 attributes such as age, gender, and expression.
• Link: [Link]

13. Caltech-256

• Description: Contains 30,607 images spanning 256 object categories, making it ideal
for object recognition tasks.
• Link: [Link]

14. Medical Image Classification

• Objective: Classify medical images into categories like disease/no disease or
different types of medical conditions (e.g., skin cancer, pneumonia).
• Dataset: [Link]

15. Age and Gender Classification

• Objective: Build a CNN model that can predict the age and gender of a person from
an image.
• Dataset: [Link]
