3/19/24, 10:26 AM Problem solving on Boolean Model and Vector Space Model - GeeksforGeeks
Problem solving on Boolean Model and Vector Space Model
Boolean Model:
The Boolean model is a simple retrieval model based on set theory and Boolean algebra. Queries are written as
Boolean expressions, which have precise semantics. The retrieval strategy is based on a binary decision criterion:
a document either satisfies the query or it does not, with no partial matching. The model records only whether
each index term is present in or absent from a document.
Problem Solving:
Consider 5 documents with a vocabulary of 6 terms:
document 1 = ‘term1 term3’
document 2 = ‘term2 term4 term6’
document 3 = ‘term1 term2 term3 term4 term5’
document 4 = ‘term1 term3 term6’
document 5 = ‘term3 term4’
Our documents in the Boolean model (1 = term present, 0 = term absent):
            term1   term2   term3   term4   term5   term6
document 1  1       0       1       0       0       0
document 2  0       1       0       1       0       1
document 3  1       1       1       1       1       0
document 4  1       0       1       0       0       1
document 5  0       0       1       1       0       0
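The incidence table can be rebuilt mechanically from the document strings. The following is a minimal Python sketch, assuming whitespace-separated terms and writing ‘term 2’ as term2; the names docs, vocabulary and incidence_matrix are illustrative and do not come from the article.

```python
# Documents from the example, written as whitespace-separated terms.
docs = {
    "document 1": "term1 term3",
    "document 2": "term2 term4 term6",
    "document 3": "term1 term2 term3 term4 term5",
    "document 4": "term1 term3 term6",
    "document 5": "term3 term4",
}
vocabulary = ["term1", "term2", "term3", "term4", "term5", "term6"]


def incidence_matrix(docs, vocabulary):
    """Map each document name to a 0/1 row: 1 if the term occurs, else 0."""
    matrix = {}
    for name, text in docs.items():
        present = set(text.split())
        matrix[name] = [1 if term in present else 0 for term in vocabulary]
    return matrix


matrix = incidence_matrix(docs, vocabulary)
for name, row in matrix.items():
    print(name, row)
```

Each printed row should correspond to one line of the table, with the columns in vocabulary order.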
Consider the query:
Find the documents that contain term1 and term3 but not term2
term1 ∧ term3 ∧ ¬term2
            term1   ¬term2  term3   term4   term5   term6
document 1  1       1       1       0       0       0
document 2  0       0       0       1       0       1
document 3  1       0       1       1       1       0
document 4  1       1       1       0       0       1
document 5  0       1       1       1       0       0
Evaluating term1 ∧ term3 ∧ ¬term2 for each document:
document 1 : 1 ∧ 1 ∧ 1 = 1
document 2 : 0 ∧ 0 ∧ 0 = 0
document 3 : 1 ∧ 1 ∧ 0 = 0
document 4 : 1 ∧ 1 ∧ 1 = 1
document 5 : 0 ∧ 1 ∧ 1 = 0
Based on the above computation, document 1 and document 4 are relevant to the given query.
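The same evaluation can be sketched in a few lines of Python. This is only an illustration: the rows dictionary restates the incidence table of the example (columns term1 to term6), and the helper name matches is hypothetical.

```python
# Incidence rows from the example; columns are term1..term6.
rows = {
    "document1": [1, 0, 1, 0, 0, 0],
    "document2": [0, 1, 0, 1, 0, 1],
    "document3": [1, 1, 1, 1, 1, 0],
    "document4": [1, 0, 1, 0, 0, 1],
    "document5": [0, 0, 1, 1, 0, 0],
}


def matches(row):
    """Evaluate term1 AND term3 AND NOT term2 on one 0/1 row."""
    term1, term2, term3 = row[0], row[1], row[2]
    return bool(term1 and term3 and not term2)


relevant = [name for name, row in rows.items() if matches(row)]
print(relevant)  # ['document1', 'document4']
```

Because the decision is binary, the result is an unordered set of matching documents rather than a ranked list.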
Vector Model:
The method and the formulas required for the computation are described in the previous article (part 1).
Consider the following collection of documents.
document1 = ‘one two’
document2 = ‘three two four’
document3 = ‘one two three’
document4 = ‘one two’
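The formulas used
IDF(t) = log2(N / n_t), where N is the total number of documents and n_t is the number of documents that contain term t.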
The terms do not all appear in the same number of documents: ‘two’ appears in all four, ‘one’ in three, ‘three’ in
two, and ‘four’ in only one. The total number of documents is N = 4.
Therefore, the IDF values of the terms are:
one --> log2(4/3) = 0.4150
two --> log2(4/4) = 0
three --> log2(4/2) = 1
four --> log2(4/1) = 2
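These IDF values can be checked with a short Python sketch. It assumes whitespace tokenisation and the base-2 logarithm used in the list above; the function name idf_table is illustrative.

```python
import math

# The four documents of the example collection.
docs = ["one two", "three two four", "one two three", "one two"]


def idf_table(docs):
    """Return {term: log2(N / df)}, where df counts documents containing the term."""
    n_docs = len(docs)
    doc_sets = [set(d.split()) for d in docs]
    vocabulary = sorted(set().union(*doc_sets))
    idf = {}
    for term in vocabulary:
        df = sum(1 for terms in doc_sets if term in terms)
        idf[term] = math.log2(n_docs / df)
    return idf


idf = idf_table(docs)
for term, value in idf.items():
    print(term, round(value, 4))
```

A term that occurs in every document, such as ‘two’, gets an IDF of 0 and therefore cannot help distinguish documents.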
Representation in the Boolean model
            one     two     three   four
document1   1       1       0       0
document2   0       1       1       1
document3   1       1       1       0
document4   1       1       0       0
Calculation of term frequency (here, the fraction of the N = 4 documents that contain the term)
one --> 3/4 = 0.75
two --> 4/4 = 1
three --> 2/4 = 0.5
four --> 1/4 = 0.25
Calculation of weights (tf * idf)
weight(one) --> 0.75 * 0.4150 = 0.3113
weight(two) --> 1 * 0 = 0
weight(three) --> 0.5 * 1 = 0.5
weight(four) --> 0.25 * 2 = 0.5
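The weight calculation can be reproduced as follows. This sketch follows the convention of this worked example, where tf is taken as the fraction of documents containing the term; the variable names doc_freq and weights are assumptions made for illustration.

```python
import math

n_docs = 4
# Number of documents that contain each term, read off the example collection.
doc_freq = {"one": 3, "two": 4, "three": 2, "four": 1}

weights = {}
for term, df in doc_freq.items():
    tf = df / n_docs              # term frequency as used in this example
    idf = math.log2(n_docs / df)  # inverse document frequency, base 2
    weights[term] = tf * idf

for term, weight in weights.items():
    print(term, round(weight, 4))
```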
Representation of the vector model in terms of weights
            one     two     three   four
document1   0.3113  0       0       0
document2   0       0       0.5     0.5
document3   0.3113  0       0.5     0
document4   0.3113  0       0       0
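One way to obtain these document vectors is to multiply each 0/1 incidence row by the per-term weights. The Python sketch below does this under the same weighting convention as the example; the names incidence, term_weight and doc_vectors are illustrative.

```python
import math

terms = ["one", "two", "three", "four"]
incidence = {
    "document1": [1, 1, 0, 0],
    "document2": [0, 1, 1, 1],
    "document3": [1, 1, 1, 0],
    "document4": [1, 1, 0, 0],
}
n_docs = len(incidence)

# Per-term weight as in the worked example: (df / N) * log2(N / df).
term_weight = []
for col in range(len(terms)):
    df = sum(row[col] for row in incidence.values())
    term_weight.append((df / n_docs) * math.log2(n_docs / df))

# A document keeps a term's weight only if it contains that term.
doc_vectors = {
    name: [bit * weight for bit, weight in zip(row, term_weight)]
    for name, row in incidence.items()
}

for name, vector in doc_vectors.items():
    print(name, [round(x, 4) for x in vector])
```

Note that document1 and document4 end up with identical vectors, which is why they tie in any similarity-based ranking.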
QUERY: Documents containing ‘one three three’
Calculation of weights for the query terms (term frequency within the query)
weight(one) --> 1/3 = 0.333
weight(three) --> 2/3 = 0.667
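The query weights are relative frequencies within the query string. A minimal Python sketch, assuming whitespace tokenisation (the helper name query_weights is hypothetical):

```python
from collections import Counter


def query_weights(query):
    """Relative frequency of each term within the query."""
    tokens = query.split()
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


q = query_weights("one three three")
print({term: round(weight, 3) for term, weight in q.items()})
# {'one': 0.333, 'three': 0.667}
```

Terms that do not occur in the query (‘two’ and ‘four’ here) implicitly have weight 0 in the query vector.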
Vector representation
[Figure: document and query vectors]
Similarity calculation
[Figure: similarity between each document and the query]
Ranking of the documents (documents with equal similarity are given the same rank, following the usual
statistical convention for ties):
document1   2nd
document2   4th
document3   1st
document4   2nd
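The tie handling in this list looks like standard competition ranking (1, 2, 2, 4), where tied items share the best rank and the next rank is skipped. That reading is an inference from the ranks shown, not something the article states. The Python sketch below illustrates the convention; the scores in it are placeholder values chosen only to demonstrate a tie and are NOT the article's similarity values.

```python
def competition_ranks(scores):
    """Rank names by descending score; tied scores share the best rank
    and the rank positions after a tie are skipped (1, 2, 2, 4)."""
    ordered = sorted(scores.values(), reverse=True)
    # list.index() returns the first position of a score, so ties share a rank.
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


# Placeholder scores for illustration only (hypothetical, not from the article).
example_scores = {
    "document1": 0.4,
    "document2": 0.1,
    "document3": 0.9,
    "document4": 0.4,
}
print(competition_ranks(example_scores))
# {'document1': 2, 'document2': 4, 'document3': 1, 'document4': 2}
```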
Since the similarity between document3 and the query is greater than that of any other document, document3 is
the most relevant to the query.
Last Updated : 30 May, 2021
D deviprajw…
Article Tags : DSA , Machine Learning , Project , Searching
Practice Tags : Machine Learning, Searching