IR Midsem

Uploaded by

manishkumarthalor222

2/7/22, 8:34 AM IR Midsem

IR Midsem
Description:
1. The exam contains 24 MCQs.
2. There may be more than one option correct for each question.
3. Some questions are worth 1 point; the rest are worth 2 points.
4. There is no partial marking. Full marks will be awarded for a question if and only if all
correct and no wrong options are selected.
5. No negative marking.

Important Guidelines:
1. Open book
2. You may use a calculator (do not use a mobile phone calculator).
3. Kindly ensure your videos are on.
4. No extension will be given.


If we use bigram indexes, which of the following words would be falsely enumerated by co*me? (2 points)

come

comment

income

coulome
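As a study aid, here is a minimal sketch of how a bigram index answers a wildcard query and why a post-filter is needed. It assumes `$` marks both word boundaries and that the pattern contains a single `*`; the function names are hypothetical, and the `mon*` vocabulary is an illustrative example rather than exam data.

```python
import fnmatch

def kgrams(term, k=2):
    """Set of character k-grams of term, with $ marking both ends."""
    padded = f"${term}$"
    return {padded[i:i + k] for i in range(len(padded) - k + 1)}

def query_kgrams(pattern, k=2):
    """k-grams of a wildcard pattern; grams that would span the * are dropped."""
    grams = set()
    for piece in f"${pattern}$".split("*"):
        grams |= {piece[i:i + k] for i in range(len(piece) - k + 1)}
    return grams

def wildcard_lookup(pattern, vocabulary, k=2):
    """Return (candidates, false_positives) for pattern over vocabulary."""
    needed = query_kgrams(pattern, k)
    # A word is enumerated when it contains every k-gram of the query.
    candidates = [w for w in vocabulary if needed <= kgrams(w, k)]
    # Post-filter: enumerated words that do not actually match the pattern.
    false_positives = [w for w in candidates
                       if not fnmatch.fnmatchcase(w, pattern)]
    return candidates, false_positives

# Illustrative vocabulary: "moon" contains $m, mo and on but does not match mon*.
print(wildcard_lookup("mon*", ["moon", "month", "demon"]))
# -> (['moon', 'month'], ['moon'])
```

The same call with the pattern `co*me` and the four options above shows which words the index enumerates under these boundary-marker assumptions.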

https://docs.google.com/forms/d/e/1FAIpQLSfLz0PuhiDjyQ5fV4QDqfagw-H3-9EuV4Iqvmn7ZZM_Qx0UJg/viewform 1/11

What is the minimum number of copies of data maintained in HDFS? (1 point)

What is the idf of a term that occurs in every document? (1 point)

log10(N)

log10(1/N)
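For reference, a short worked equation, assuming the common definition of inverse document frequency with base-10 logarithms (the exam does not restate its formula, so treat this definition as an assumption). Here N is the number of documents and df_t the number of documents containing term t:

```latex
\mathrm{idf}_t = \log_{10}\frac{N}{\mathrm{df}_t},
\qquad
\mathrm{df}_t = N \;\Longrightarrow\; \mathrm{idf}_t = \log_{10} 1 = 0 .
```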

Rank the following documents in decreasing order of their tf-idf score with respect to the query “All vehicles including car auto bike bus are stopped due to accident”. Vocabulary = {car, auto, bike, bus}. (Use tf-idf = tf x idf.) (2 points)

doc1, doc2 , doc3

doc2, doc3, doc1

doc3, doc2, doc1

doc1, doc3, doc2


In logarithmic merge with n = 3, we have 47 tokens to process. Find which of the indexes, including the auxiliary index (Z0, I0, I1, I2, I3, I4), are in use after all the tokens have been processed. See the table for the representation. (Assume Z0 < n.) (2 points)

1, 1, 1, 0, 1, 0

0, 1, 1, 1, 1, 0

1, 1, 1, 1, 1, 0

0, 0, 0, 1, 1, 1
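The bookkeeping behind logarithmic merging can be sketched as a binary counter. This is a minimal simulation, assuming Z0 is flushed as soon as it holds n tokens and that index Ii, when present, has size n * 2^i; the function name and the `levels` parameter are hypothetical.

```python
def logarithmic_merge_state(num_tokens, n, levels=5):
    """Simulate logarithmic merging; return [Z0, I0, ..., I(levels-1)] as 0/1 flags.

    Assumes `levels` is large enough to hold every merge (IndexError otherwise).
    """
    z0 = 0                      # tokens currently held in the in-memory index Z0
    on_disk = [False] * levels  # on_disk[i] is True while Ii exists
    for _ in range(num_tokens):
        z0 += 1
        if z0 == n:             # Z0 is full: flush it and cascade merges upward
            i = 0
            while on_disk[i]:   # Ii exists, so it is merged into the carried index
                on_disk[i] = False
                i += 1
            on_disk[i] = True   # the first free level receives the merged index
            z0 = 0
    return [int(z0 > 0)] + [int(flag) for flag in on_disk]

print(logarithmic_merge_state(47, 3))  # -> [1, 1, 1, 1, 1, 0]
```

With 47 tokens and n = 3 there are 15 flushes (binary 1111) and 2 tokens left in Z0, which is where the printed pattern comes from.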

Which of the following are functions of the parser in distributed indexing? (1 point)

Sorts and writes to a posting list.

Writes pairs into k partitions, where k ∈ N.

Reads document at a time and emits a pair.

Assigns a split into an idle machine.

Collects all pairs for one partition


How would the wildcard query qu*ry be expressed for lookup in the permuterm index? (1 point)

ry$*qu

ry$qu*

$qu*ry

qu*ry$
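The rotation rule for a permuterm-style index can be sketched as follows. This is a minimal illustration for patterns with exactly one `*`; the function names are hypothetical.

```python
def permuterm_rotations(term):
    """All rotations of term + '$', as they would be stored in the index."""
    marked = term + "$"
    return [marked[i:] + marked[:i] for i in range(len(marked))]

def permuterm_query(pattern):
    """Rotate a single-* pattern X*Y into Y$X* so the wildcard is trailing."""
    prefix, suffix = pattern.split("*")  # raises ValueError unless exactly one *
    return f"{suffix}${prefix}*"

print(permuterm_query("qu*ry"))  # -> ry$qu*
```

The rotated query is then a plain prefix lookup: for example, the stored rotation `ry$que` of `query` begins with `ry$qu`.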

Compute the edit distance between “cats” and “fast” (with insertion, deletion and substitution only). (2 points)
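A minimal sketch of the standard dynamic-programming computation, assuming unit cost for each insertion, deletion and substitution (the function name is hypothetical):

```python
def edit_distance(s1, s2):
    """Levenshtein distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(s2) + 1))      # distances from "" to each prefix of s2
    for i, c1 in enumerate(s1, start=1):
        curr = [i]                       # distance from s1[:i] to ""
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + 1,               # delete c1
                curr[j - 1] + 1,           # insert c2
                prev[j - 1] + (c1 != c2),  # substitute, or copy if equal
            ))
        prev = curr
    return prev[-1]

print(edit_distance("cats", "fast"))  # -> 3
```

Only two rows of the table are kept at a time, which is enough when just the distance (not the alignment) is needed.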

Which of the following does not improve the performance of distributed processing? (2 points)

None of the above

maintaining checksum of data

replication of data

partitioning of data


Which of the following cannot run on HDFS? (1 point)

MapReduce

Spark

Oracle Database

HBase

Real-time processing is also called (2 points)

Processing a group of events in less than a minute

Per day processing

Per event processing

Per hour processing

In which language is MapReduce written? (1 point)

Python

C++

Java

Scala


The observed word is “acress”. Use the table below to find the most suitable correction. (The dictionary contains only the candidate words.) (2 points)

across

actress

access

acres

Which of the following is not a data ingestion tool? (2 points)

Spark

Kafka

Flume

Sqoop

What is the purpose of the NameNode? (2 points)

Store data

None of the above

Store metadata

Schedule jobs


For the query 'bord', state the word from the dictionary that has the second-lowest Jaccard coefficient using a character 2-gram index. Dictionary = {aboard, border, dropped, lord}. (2 points)

border

lord

dropped

aboard
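A minimal sketch of the Jaccard computation over character bigrams. It assumes no `$` boundary markers are added to the words (the exam does not say either way), and the function names are hypothetical.

```python
def bigrams(term):
    """Set of character 2-grams of term, without boundary markers."""
    return {term[i:i + 2] for i in range(len(term) - 1)}

def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two sets."""
    return len(a & b) / len(a | b)

def rank_by_jaccard(query, dictionary):
    """Return (coefficient, word) pairs sorted from lowest to highest."""
    q = bigrams(query)
    return sorted((jaccard(q, bigrams(w)), w) for w in dictionary)

for score, word in rank_by_jaccard("bord", ["aboard", "border", "dropped", "lord"]):
    print(f"{word}: {score:.2f}")
```

Under these assumptions the loop prints the words in the order dropped, aboard, lord, border, from the lowest coefficient to the highest.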

The edit distance between any two strings s1 and s2 is upper bounded by? (|s| denotes the length of the string s.) (1 point)

|s1| - |s2|

min( |s1| , |s2| )

max( |s1| , |s2| )

|s1| + |s2|

Can the tf-idf weight of a term in a document exceed 1? (1 point)

True

False


Let’s say the length of the embedding vectors of songs is directly proportional to their popularity. You want to calculate the similarity between songs. Which of the following is/are true? (2 points)

If you switch from cosine similarity to dot product, popular songs become more
similar to only other popular songs.

If you switch from cosine similarity to dot product, popular songs become more
similar to all songs in general.

If you switch from dot product to cosine similarity, popular songs become less similar
than less popular songs.

If you switch from dot product to cosine similarity, popular songs become more
similar than less popular songs.

No change in song similarities when switching from cosine similarity to dot product

No change in song similarities when switching from dot product to cosine similarity
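The effect of vector length on the two measures can be checked with a small sketch. The three vectors below are made-up illustrations (not exam data): `popular` points in the same direction as `niche` but is ten times longer.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity: the dot product of the length-normalised vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

niche = (1.0, 2.0)
popular = tuple(10 * x for x in niche)  # same direction, 10x the length
other = (2.0, 1.0)

# Cosine ignores length, so both songs are equally similar to `other`.
print(cosine(niche, other), cosine(popular, other))
# The dot product scales with length, so the longer vector scores 10x higher.
print(dot(niche, other), dot(popular, other))
```

Because the dot product grows with the length of either vector, a long (popular) vector scores higher against every other vector, not only against other long ones.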

Paragraph for the next 3 questions

Q - a b c d
D1 - a a c c
D2 - b d
Here a,b,c,d are individual tokens.
For the above query (Q) and documents (D1, D2), use the lnc.ltc weighting scheme to compute the ranking score and answer the following:
(Round each calculation to 2 decimal places. Use log10.)
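The lnc.ltc computation can be sketched as below. Two assumptions are made that the exam does not state: the collection consists only of D1 and D2 (so N = 2 when computing idf), and no intermediate rounding is applied. The function names are hypothetical.

```python
import math
from collections import Counter

def log_tf(tf):
    """Logarithmic term frequency (the 'l' in lnc/ltc)."""
    return 1 + math.log10(tf) if tf > 0 else 0.0

def cosine_normalize(weights):
    """Divide every weight by the vector's Euclidean length (the 'c')."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else weights

def lnc_ltc_scores(query, docs):
    """Score each document: lnc weights for documents, ltc for the query."""
    n_docs = len(docs)  # assumption: the collection is exactly `docs`
    df = Counter(t for tokens in docs.values() for t in set(tokens))
    q_weights = cosine_normalize({
        t: log_tf(tf) * math.log10(n_docs / df[t])   # l * t (idf)
        for t, tf in Counter(query).items() if df[t] > 0
    })
    scores = {}
    for name, tokens in docs.items():
        d_weights = cosine_normalize(
            {t: log_tf(tf) for t, tf in Counter(tokens).items()})  # l, no idf
        scores[name] = sum(w * d_weights.get(t, 0.0)
                           for t, w in q_weights.items())
    return scores

scores = lnc_ltc_scores(list("abcd"), {"D1": list("aacc"), "D2": list("bd")})
print({name: round(s, 2) for name, s in scores.items()})
```

Rounding after every step, as the exam instructs, may shift the last digit relative to this unrounded sketch.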


Which of the following is/are true? (2 points)

D2 has better/larger score than D1

Whichever has better score, it is by a low margin (|difference| <= 0.02)

Whichever has better score, it is by a high margin (|difference| > 0.02)

D1 has better/larger score than D2

Now, if we take the Euclidean distance between the normalized vectors (instead of the product), which of the following is/are true? (The ranking order referred to in this question is the one obtained in Q1.) (2 points)

The ranking order remains the same and the margin is low (|difference| <= 0.02)

The ranking order remains the same and the margin is high (|difference| > 0.02)

The ranking order remains the same

The ranking order reverses


Now, if we take the product without normalizing the document vectors, which of the following is/are true? (The ranking order referred to in this question is the one obtained in Q1.) (2 points)

The ranking order remains the same and the margin is low (|difference| <= 0.02)

The ranking order remains the same and the margin is high (|difference| > 0.02)

The ranking order reverses

The ranking order remains the same

Paragraph for the next 2 questions

Q - a b c
D1 - a a d
D2 - b c a
D3 - a a
Here a,b,c,d are individual tokens.
While ranking the documents using the Binary Independence Model (BIM), in a particular iteration we get
user feedback which tells us that:
(i) All documents are relevant
(ii) A term/token is relevant to a document if the document contains that specific term/token.
Now for this particular iteration, answer the following:
(Use log10 wherever log is required)

Which of the following is/are true? (Hint: Use the contingency table. For smoothing, add 0.5 to every count in the table.) (2 points)

The log-odds ratio for term ‘a’ is 0.845

The log-odds ratio for term ‘c’ is -0.14

The log-odds ratio for term ‘a’ is 0.645

The log-odds ratio for term ‘b’ is -0.22


Which of the following is/are true? (RSV(D) denotes the Retrieval Status Value for document D.) (2 points)

RSV(D1) = 1.69

RSV(D1) = 1.29

RSV(D1) = RSV(D3)

RSV(D2) = 0.60
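For the BIM setup above, a minimal sketch of the smoothed log-odds weight and the resulting RSV. It assumes the textbook contingency-table form with 0.5 added to every cell, and reads the feedback as "all three documents are relevant"; the function names are hypothetical.

```python
import math

def bim_term_weights(query, docs, relevant):
    """Smoothed log-odds ratio c_t (base 10) for each query term."""
    N, S = len(docs), len(relevant)
    weights = {}
    for t in query:
        n = sum(t in tokens for tokens in docs.values())  # docs containing t
        s = sum(t in docs[name] for name in relevant)     # relevant docs containing t
        p_odds = (s + 0.5) / (S - s + 0.5)                # odds of t in relevant docs
        u_odds = (n - s + 0.5) / (N - n - S + s + 0.5)    # odds of t in non-relevant docs
        weights[t] = math.log10(p_odds / u_odds)
    return weights

def rsv(doc_tokens, weights):
    """Sum of c_t over the query terms that occur in the document."""
    return sum(w for t, w in weights.items() if t in doc_tokens)

docs = {"D1": ["a", "a", "d"], "D2": ["b", "c", "a"], "D3": ["a", "a"]}
weights = bim_term_weights(["a", "b", "c"], docs, relevant=list(docs))
print({t: round(w, 3) for t, w in weights.items()})
print({name: round(rsv(tokens, weights), 2) for name, tokens in docs.items()})
```

Under this reading, the weight for 'a' is log10(7) and the weights for 'b' and 'c' are both log10(0.6); D1 and D3 contain only 'a' among the query terms, so they receive the same RSV.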


This form was created inside of IIIT Delhi.


