ImageEntity Extractor:
Automated Extraction of Product
Attributes from Images Using
Machine Learning
Samarth Shinde
18 September 2024
Table of Contents
1. Executive Summary
2. Introduction
3. Problem Statement
4. Objectives
5. Methodology
• Data Collection
• Data Preprocessing
• Model Architecture
• Training Process
• Evaluation Metrics
6. Results
7. Challenges and Solutions
8. Conclusion
9. Future Work
10. References
Executive Summary
The ImageEntity Extractor project aims to automate the
extraction of key product attributes from images, addressing
the challenge of limited textual descriptions in digital
marketplaces. By leveraging Optical Character Recognition
(OCR) and advanced machine learning models, the project
successfully extracted vital information such as weight,
volume, voltage, wattage, and dimensions from product
images. Achieving an F1 score of 0.289 and ranking 660th out of
1,980 participating teams in the hackathon, this project demonstrates
the potential of integrating computer vision and natural language
processing to enhance data accuracy and efficiency in e-commerce
platforms.
Introduction
In the rapidly expanding digital marketplace, accurate and detailed
product information is paramount for consumer trust and informed
decision-making. However, many products lack comprehensive
textual descriptions, relying solely on images. Extracting key attributes
directly from images can bridge this information gap, enhancing
product listings and improving user experience. This project explores
the application of machine learning techniques to automate the
extraction of such attributes, providing a scalable solution for large e-
commerce platforms.
Problem Statement
Digital marketplaces often face the issue of incomplete or insufficient
product descriptions, relying heavily on images that may not convey
all necessary details. Essential attributes like weight, volume, voltage,
wattage, and dimensions are critical for consumers to make informed
purchases but are frequently absent in textual form. Manually
annotating these attributes is time-consuming and prone to errors.
Therefore, there is a need for an automated system that can
accurately extract these entity values from product images.
Objectives
• Automate Extraction: Develop a machine learning model capable
of extracting specific product attributes from images.
• Enhance Data Accuracy: Improve the accuracy and consistency of
product information in digital marketplaces.
• Scalability: Create a scalable solution that can handle large
volumes of images with varying qualities and formats.
• Efficiency: Reduce the time and resources required for manual
data annotation.
Methodology
Data Collection
The dataset comprised product images along with corresponding CSV
files containing index, image_link, group_id, entity_name, and
entity_value. The training dataset included labeled entity values, while
the test dataset provided images without these labels for prediction.
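As a rough illustration, the training CSV can be inspected with pandas before any image processing. The file path below is an assumption; the report does not state the dataset's directory layout.

import pandas as pd

# Hypothetical path; the actual CSV name and location come from the hackathon dataset.
train_df = pd.read_csv("dataset/train.csv")

# Columns described above: index, image_link, group_id, entity_name, entity_value
print(train_df[["image_link", "group_id", "entity_name", "entity_value"]].head())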
Data Preprocessing
1. Image Downloading: Utilized the download_images.py script to
download images from provided URLs, organizing them into
images/train and images/test directories.
2. OCR Processing: Employed Tesseract OCR via the pytesseract
library to extract textual information from images.
3. Data Cleaning: Processed the OCR outputs to remove noise and
irrelevant text, ensuring only pertinent data was used for training.
4. Feature Engineering: Combined extracted text with entity names
to create input features for the model (see the sketch after this list).
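A minimal sketch of steps 2–4 above, assuming Tesseract is installed locally. The helper names, the cleaning regex, and the input template are illustrative choices, not the project's exact code.

import re
from PIL import Image
import pytesseract

def extract_text(image_path):
    # Step 2: run Tesseract OCR on a downloaded product image.
    return pytesseract.image_to_string(Image.open(image_path))

def clean_text(raw):
    # Step 3: keep alphanumeric tokens, units, and light punctuation; collapse whitespace.
    text = re.sub(r"[^A-Za-z0-9.,%/ ]+", " ", raw)
    return re.sub(r"\s+", " ", text).strip()

def build_input(image_path, entity_name):
    # Step 4: pair the cleaned OCR text with the entity name to form a model input.
    return f"entity: {entity_name} text: {clean_text(extract_text(image_path))}"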
Model Architecture
Implemented a Transformer-based model using Hugging Face’s T5
architecture. The choice of T5 was due to its versatility in handling
text-to-text tasks, making it suitable for mapping extracted text to
specific entity values.
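A minimal sketch of loading T5 with Hugging Face Transformers is shown below. The t5-small checkpoint and the prompt format are assumptions, since the report does not specify the model size or input template.

from transformers import T5Tokenizer, T5ForConditionalGeneration

MODEL_NAME = "t5-small"  # assumed checkpoint size

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

# Text-to-text framing: OCR-derived input in, entity value string out.
inputs = tokenizer("entity: item_weight text: Net Wt 500 g", return_tensors="pt")
outputs = model.generate(**inputs, max_length=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))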
Training Process
1. Tokenization: Used T5Tokenizer to tokenize input and target
texts, ensuring uniform input sizes.
2. Dataset Preparation: Created a custom PyTorch Dataset class to
handle input-output pairs, facilitating efficient data loading (see the sketch after this list).
3. Training Loop: Trained the model using the MPS backend on a
MacBook M1 GPU, optimizing with the AdamW optimizer and a linear
learning rate scheduler.
4. Validation: Monitored performance using the F1 score to evaluate
precision and recall of the model’s predictions.
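The sketch below ties the four steps together under stated assumptions: train_inputs and train_targets are lists of strings produced by preprocessing, the tokenizer and model are the T5 objects loaded above, and the batch size, learning rate, and epoch count are illustrative rather than the project's tuned values.

import torch
from torch.utils.data import Dataset, DataLoader
from transformers import get_linear_schedule_with_warmup

class EntityDataset(Dataset):
    # Step 2: wrap tokenized input-output pairs for efficient loading.
    def __init__(self, inputs, targets, tokenizer, max_len=128):
        self.inputs, self.targets = inputs, targets
        self.tokenizer, self.max_len = tokenizer, max_len

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        enc = self.tokenizer(self.inputs[idx], max_length=self.max_len,
                             padding="max_length", truncation=True, return_tensors="pt")
        tgt = self.tokenizer(self.targets[idx], max_length=16,
                             padding="max_length", truncation=True, return_tensors="pt")
        labels = tgt.input_ids.squeeze(0)
        labels[labels == self.tokenizer.pad_token_id] = -100  # ignore padding in the loss
        return {"input_ids": enc.input_ids.squeeze(0),
                "attention_mask": enc.attention_mask.squeeze(0),
                "labels": labels}

# Step 3: use the MPS backend when available (MacBook M1), else fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model.to(device)

loader = DataLoader(EntityDataset(train_inputs, train_targets, tokenizer),
                    batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=len(loader) * 3)

model.train()
for epoch in range(3):  # epoch count is an assumption
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # T5 computes cross-entropy from the labels
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()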
Evaluation Metrics
The primary metric for evaluation was the F1 score, balancing
precision and recall to provide a comprehensive measure of the
model’s accuracy in extracting entity values.
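For reference, a simple exact-match formulation of the F1 score is sketched below; the hackathon's official scorer may normalise units or handle empty predictions differently, so this is an assumed approximation rather than the exact metric.

def entity_f1(predictions, ground_truth):
    # A prediction counts as correct only if it matches the labelled value exactly.
    correct = sum(1 for p, g in zip(predictions, ground_truth) if p and p == g)
    predicted = sum(1 for p in predictions if p)   # non-empty predictions
    labelled = sum(1 for g in ground_truth if g)   # examples with a label
    precision = correct / predicted if predicted else 0.0
    recall = correct / labelled if labelled else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)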
Results
The ImageEntity Extractor achieved an F1 score of 0.289,
securing the 660th rank out of 1,980 participating teams in the
hackathon. While there is room for improvement, this result
demonstrates the feasibility of using machine learning for
automated entity extraction from images. The model effectively
identified and extracted key attributes, though further
enhancements in data preprocessing and model architecture
could yield higher accuracy.
Challenges and Solutions
• Data Quality: Variations in image quality and text readability posed
significant challenges. To mitigate this, extensive data cleaning and
augmentation techniques were employed to enhance OCR
accuracy.
• Model Performance: Achieving a higher F1 score required fine-
tuning the Transformer model and experimenting with different
architectures. Future iterations may explore larger models or
ensemble techniques.
• Resource Constraints: Training on a MacBook M1 limited
computational resources. Optimizing code and leveraging efficient
libraries helped maximize performance within these constraints.
Conclusion
The ImageEntity Extractor successfully demonstrated the potential
of machine learning in automating the extraction of product attributes
from images. Despite achieving a moderate F1 score, the project laid
a strong foundation for further enhancements. By addressing data
quality issues and optimizing model architectures, future work can
significantly improve accuracy, making this solution highly valuable for
e-commerce platforms seeking to enrich product information
efficiently.
Future Work
• Model Optimization: Explore more advanced Transformer
architectures or incorporate pre-trained models specialized in OCR
tasks.
• Data Augmentation: Implement more sophisticated data
augmentation techniques to enhance model robustness against
varying image qualities.
• Multi-Attribute Extraction: Extend the model to handle multi-
attribute extraction simultaneously, improving efficiency and
scalability.
• Deployment: Develop a deployment pipeline to integrate the model
into live e-commerce platforms, enabling real-time attribute
extraction.
References
• Vaswani, A., et al. (2017). “Attention Is All You Need.” Advances in Neural Information Processing Systems.
• Tesseract OCR Documentation. Retrieved from https://github.com/tesseract-ocr/tesseract
• Hugging Face Transformers. Retrieved from https://huggingface.co/transformers/
• PyTorch Documentation. Retrieved from https://pytorch.org/docs/stable/index.html