Generalized Vector Model
Classic models enforce independence of index terms. For the Vector model:

The set of term vectors {k1, k2, ..., kt} is linearly independent and forms a basis for the subspace of interest.

Frequently, this is interpreted as pairwise orthogonality: for all i != j, ki . kj = 0.
In 1985, Wong, Ziarko, and Wong proposed an interpretation in which the set of terms is linearly independent, but not pairwise orthogonal.
Key Idea:

In the generalized vector model, two index terms might be non-orthogonal; terms are represented in terms of smaller components (minterms). As before, let:

  wij be the weight associated with the pair [ki, dj]
  {k1, k2, ..., kt} be the set of all terms
If these weights are all binary, all patterns of occurrence of terms within documents can be represented by the 2^t minterms:

  m1 = (0, 0, ..., 0)
  m2 = (1, 0, ..., 0)
  m3 = (0, 1, ..., 0)
  m4 = (1, 1, ..., 0)
  m5 = (0, 0, 1, ..., 0)
  ...
  m_{2^t} = (1, 1, 1, ..., 1)

Here, the minterm m2 indicates documents in which solely the term k1 occurs.
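As an illustration (not part of the original model description), the mapping from a document's binary term-occurrence pattern to its minterm number can be sketched as follows; the function name `minterm_index` is ours:

```python
# Sketch: with t terms there are 2^t minterms. Under the numbering above
# (m1 = all zeros, m2 = (1,0,...,0), m3 = (0,1,...,0), m4 = (1,1,...,0), ...),
# pattern (g1, ..., gt) maps to m_r with r = 1 + sum_l g_l * 2^(l-1).

def minterm_index(pattern):
    """Return r such that the binary occurrence pattern corresponds to minterm m_r."""
    return 1 + sum(g << l for l, g in enumerate(pattern))

# Example with t = 3 terms:
assert minterm_index((0, 0, 0)) == 1  # m1: no term occurs
assert minterm_index((1, 0, 0)) == 2  # m2: solely k1 occurs
assert minterm_index((1, 1, 1)) == 8  # m8 = m_{2^t}: all terms co-occur
```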
Key Idea:

The basis for the generalized vector model is formed by a set of 2^t vectors defined over the set of minterms, as follows:

  m1 = (1, 0, 0, ..., 0, 0)
  m2 = (0, 1, 0, ..., 0, 0)
  m3 = (0, 0, 1, ..., 0, 0)
  ...
  m_{2^t} = (0, 0, 0, ..., 0, 1)

Notice that mi . mj = 0 for all i != j, i.e., these vectors are pairwise orthogonal.
Key Idea:

The minterm vectors are pairwise orthogonal. But this does not mean that the index terms are independent. For instance, the minterm m4 is given by:

  m4 = (1, 1, 0, ..., 0, 0)

This minterm indicates the occurrence of the terms k1 and k2 within the same document. If such a document exists in the collection, we say that the minterm m4 is active and that a dependency between these two terms is induced. The generalized vector model adopts as a basic foundation the notion that co-occurrence of terms within documents induces dependencies among them.
Forming the Term Vectors
The vector associated with the term ki is computed as:

  ki = ( sum_{r | gi(mr)=1} c_{i,r} mr ) / sqrt( sum_{r | gi(mr)=1} c_{i,r}^2 )

where

  c_{i,r} = sum_{dj | for all l, gl(dj)=gl(mr)} wij

The weight c_{i,r} associated with the pair [ki, mr] sums up the weights of the term ki in all the documents whose term occurrence pattern is given by mr. Notice that, for a collection of size N, only N minterms affect the ranking (and not 2^t).
Dependency between Index Terms
A degree of correlation between the terms ki and kj can now be computed as:

  ki . kj = sum_{r | gi(mr)=1 and gj(mr)=1} c_{i,r} * c_{j,r}

This degree of correlation sums up (in a weighted form) the dependencies between ki and kj induced by the documents in the collection (represented by the mr minterms).
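This formula is a sparse dot product over the minterms active for both terms. A sketch (the function name is ours; vectors are dicts keyed by minterm number, and the c values are left unnormalized as in the formula above):

```python
def term_correlation(c_i, c_j):
    """Unnormalized k_i . k_j: sum of c_{i,r} * c_{j,r} over minterms m_r
    active for both terms (keys present in both sparse dicts)."""
    return sum(v * c_j[r] for r, v in c_i.items() if r in c_j)

# With the worked example's values c_{1,r} = {m2: 3, m6: 2, m8: 1} and
# c_{3,r} = {m6: 1, m7: 5, m8: 4}, only m6 and m8 are shared:
c1 = {2: 3, 6: 2, 8: 1}
c3 = {6: 1, 7: 5, 8: 4}
print(term_correlation(c1, c3))  # 2*1 + 1*4 = 6
```

Terms that never co-occur share no active minterm, so their correlation is zero, recovering the orthogonality of the classic model in that case.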
The Generalized Vector Model: An Example

[Figure: Venn diagram of the terms k1, k2, k3 showing which of the documents d1-d7 contain each term]

Term weights wij:

      d1  d2  d3  d4  d5  d6  d7   q
  k1   2   1   0   2   1   0   0   1
  k2   0   0   1   0   2   2   5   2
  k3   1   0   3   0   4   2   0   3
Computation of c_{i,r}

Term weights wij:

      d1  d2  d3  d4  d5  d6  d7   q
  k1   2   1   0   2   1   0   0   1
  k2   0   0   1   0   2   2   5   2
  k3   1   0   3   0   4   2   0   3

Binary occurrence patterns:

      d1  d2  d3  d4  d5  d6  d7   q
  k1   1   1   0   1   1   0   0   1
  k2   0   0   1   0   1   1   1   1
  k3   1   0   1   0   1   1   0   1

Corresponding minterms:

  d1 = m6, d2 = m2, d3 = m7, d4 = m2, d5 = m8, d6 = m7, d7 = m3, q = m8

Applying c_{i,r} = sum_{dj | for all l, gl(dj)=gl(mr)} wij:

        m1  m2  m3  m4  m5  m6  m7  m8
  c1,r   0   3   0   0   0   2   0   1
  c2,r   0   0   5   0   0   0   3   2
  c3,r   0   0   0   0   0   1   5   4
Computation of Index Term Vectors
        m1  m2  m3  m4  m5  m6  m7  m8
  c1,r   0   3   0   0   0   2   0   1
  c2,r   0   0   5   0   0   0   3   2
  c3,r   0   0   0   0   0   1   5   4

  k1 = (3 m2 + 2 m6 + 1 m8) / sqrt(3^2 + 2^2 + 1^2)
  k2 = (5 m3 + 3 m7 + 2 m8) / sqrt(5^2 + 3^2 + 2^2)
  k3 = (1 m6 + 5 m7 + 4 m8) / sqrt(1^2 + 5^2 + 4^2)
Computation of Document Vectors
      d1  d2  d3  d4  d5  d6  d7   q
  k1   2   1   0   2   1   0   0   1
  k2   0   0   1   0   2   2   5   2
  k3   1   0   3   0   4   2   0   3

  d1 = 2 k1 + k3
  d2 = k1
  d3 = k2 + 3 k3
  d4 = 2 k1
  d5 = k1 + 2 k2 + 4 k3
  d6 = 2 k2 + 2 k3
  d7 = 5 k2
  q  = k1 + 2 k2 + 3 k3
Ranking Computation
  k1 = (3 m2 + 2 m6 + 1 m8) / sqrt(3^2 + 2^2 + 1^2)
  k2 = (5 m3 + 3 m7 + 2 m8) / sqrt(5^2 + 3^2 + 2^2)
  k3 = (1 m6 + 5 m7 + 4 m8) / sqrt(1^2 + 5^2 + 4^2)

  d1 = 2 k1 + k3
  d2 = k1
  d3 = k2 + 3 k3
  d4 = 2 k1
  d5 = k1 + 2 k2 + 4 k3
  d6 = 2 k2 + 2 k3
  d7 = 5 k2
  q  = k1 + 2 k2 + 3 k3

Expanding the term vectors into the orthogonal minterm basis, each document dj is ranked by the cosine of the angle between dj and q.
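The ranking step above can be sketched end to end; the helper names (`normalize`, `combine`, `cosine`) are ours, and the c_{i,r} values are those of the worked example:

```python
import math

def normalize(v):
    """Scale a sparse vector (dict keyed by minterm number) to unit length."""
    n = math.sqrt(sum(x * x for x in v.values()))
    return {r: x / n for r, x in v.items()}

# Term vectors k_i in the minterm basis, from the c_{i,r} table.
k = {
    'k1': normalize({2: 3, 6: 2, 8: 1}),
    'k2': normalize({3: 5, 7: 3, 8: 2}),
    'k3': normalize({6: 1, 7: 5, 8: 4}),
}

def combine(weights):
    """Expand d_j = sum_i w_ij k_i into the minterm basis."""
    v = {}
    for term, wt in weights.items():
        for r, x in k[term].items():
            v[r] = v.get(r, 0.0) + wt * x
    return v

def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(x * v[r] for r, x in u.items() if r in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

q = combine({'k1': 1, 'k2': 2, 'k3': 3})
docs = {
    'd1': combine({'k1': 2, 'k3': 1}),
    'd2': combine({'k1': 1}),
    'd3': combine({'k2': 1, 'k3': 3}),
    'd4': combine({'k1': 2}),
    'd5': combine({'k1': 1, 'k2': 2, 'k3': 4}),
    'd6': combine({'k2': 2, 'k3': 2}),
    'd7': combine({'k2': 5}),
}
ranking = sorted(docs, key=lambda d: cosine(docs[d], q), reverse=True)
```

Because d5's weights (1, 2, 4) nearly match the query's (1, 2, 3), d5 comes out on top, while d2 and d4 (which contain only k1) still score above zero through the minterms k1 shares with the query terms.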
Conclusions
- The model considers correlations among index terms
- It is not clear in which situations it is superior to the standard Vector model
- Computation costs are higher
- The model does introduce interesting new ideas