MEASURING SIMILARITY BETWEEN
QUESTION PAIRS IN ONLINE FORUMS

1st Pramod Kumar Rai
Dept. of Computer Science and Engineering
National Institute of Technology Agartala
Agartala, India
[email protected]

2nd Kunal Chakma
Dept. of Computer Science and Engineering
National Institute of Technology Agartala
Agartala, India
[email protected]
Abstract—Two questions asking the same thing can have different vocabulary sets and syntactic structures, which makes detecting the semantic equivalence between the sentences challenging. In online user forums like Quora, Stack Overflow, Stack Exchange etc. it is important to maintain a high-quality knowledge base by ensuring that each unique question exists only once. Writers should not have to write the same answer to each of the similar questions, and the reader must get a single page for the question they are looking for. For example, consider questions like "What are the best ways to lose weight?", "How can a person reduce weight?", and "What are effective weight loss plans?" to be duplicate questions because they all have the same intent. The work presented in this report is an attempt to study, analyze and accordingly propose methods for finding the similarity of questions posted on online forums.

Index Terms—Text Similarity, Text Mining, NLP
I. INTRODUCTION

In the last two decades, the internet or the web has evolved into a reservoir of information and its growth has been phenomenal. Due to this, there are today large numbers of texts or documents available on the web, some of which may be redundant or carry similar information. So, getting the appropriate documents from the web as per a user's requirement is difficult. For this, there are different techniques available through which we can retrieve the most relevant documents from the web. Therefore, finding similarity between words, sentences, paragraphs and documents is an important part of various tasks such as retrieving information from the web, text clustering (document organization), automatic grade assignment to essays, short answer scoring, machine translation and text summarization. Text similarity means how semantically close two or more documents are to each other with respect to an information requirement. For example, an information requirement about Apple the fruit and Apple Inc. the company is not similar, whereas an information requirement for Apple iPhone and Google Pixel is similar. A user's information requirement is fulfilled by retrieving those documents whose contents satisfy the question or query of the user. Such documents are considered the most relevant documents with respect to the user's queries. Text similarity is also used to categorize texts as well as documents. We can also evaluate the similarity between sentences, words, paragraphs and documents in order to classify them in an appropriate way. On the basis of this classification of texts and documents, we can get the most relevant documents corresponding to what the user wants.

Several levels of similarity exist in natural languages: word level, phrase level, sentence level and document level. Words can be classified into parts of speech like noun, pronoun, adjective, adverb etc., or they may be classified into synonyms and antonyms, which is based on the similarity between words. Similarity between documents is the basis of text classification as well as text clustering. Sentence similarity lies between word similarity and document similarity. The methods for calculating similarity differ at each of these levels.
A. Word similarity

Similarity between words can be calculated from either the spelling of the words or their meaning. Edit distance can be used to measure the similarity between words based on their spelling. Intuitively, if two words are similar in spelling, they are likely to be similar in meaning, or it can be said that they are synonyms of each other.

B. Sentence similarity

The similarities between the words of two sentences have a great impact on the similarity between the two sentences. Words and their order in the sentences are important factors for calculating sentence similarity.

C. Document similarity

The similarity between words and sentences has a great impact on the similarity between documents. Commonly used approaches are often based on the similarity between keyword sets (e.g., Dice similarity) or the similarity between vectors of keywords (e.g., cosine similarity).
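As a minimal illustration of the keyword-set view mentioned above, the following Python sketch (not part of the original experiments; the example texts are made up) computes the Dice coefficient over the word sets of two short texts:

def dice_similarity(text_a, text_b):
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|) over the word sets of two texts."""
    a = set(text_a.lower().split())
    b = set(text_b.lower().split())
    if not a and not b:
        return 1.0  # both empty: treat as identical
    return 2 * len(a & b) / (len(a) + len(b))

print(dice_similarity("apple iphone price in india",
                      "price of apple iphone"))  # ~0.67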
II. RELATED WORK

Previous works have described different approaches and techniques to measure the similarity between short text parts or chunks. Measuring similarity between long texts has been used in information retrieval; it mainly relies on the numerical, graphical and statistical information of keywords in the long texts. Keywords are generally selected on the basis of weighting schemes. Sentence similarity has been used in machine translation, translation memory, text summarization, text categorization, question answering and even image search on the Web.

Related works can roughly be classified into the following major categories:

A. Word co-occurrence methods

This approach is mainly used in Information Retrieval (IR) systems. In this method we take a list of meaningful words and treat every query as a document. A vector is then defined for the query and for the documents (from a large corpus). The most appropriate texts or documents are retrieved based on how similar the query vector and the document vectors are.
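A minimal sketch of this idea, assuming scikit-learn is available and using made-up example documents, builds term-count vectors for a small corpus and ranks the documents by their cosine similarity to the query vector:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus; in practice the vectors would be built from a large document collection.
docs = ["how can a person reduce weight",
        "best places to visit in india",
        "effective weight loss plans"]
query = "effective ways to lose weight fast"

vectorizer = CountVectorizer()
doc_vectors = vectorizer.fit_transform(docs)      # one count vector per document
query_vector = vectorizer.transform([query])      # query mapped into the same space

scores = cosine_similarity(query_vector, doc_vectors)[0]
# The weight-loss document should rank highest for this query.
print(scores, "-> most relevant document:", docs[scores.argmax()])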
B. Similarity based on a lexical database

In the lexical database methodology, similarity is measured using a predefined word hierarchy. In this hierarchy, words, their meanings and their relationships with other words are stored in a tree-like structure [15]. When comparing two words, the method takes into account the path distance between the words in this hierarchy.
C. Method based on web search engine results

The third method calculates the similarity between texts as relatedness based on web search engine results, utilizing the total number of search results [15]. We have implemented this methodology to calculate the Google Similarity Distance [15]. The search engines that we used for this methodology are Google, Bing, Ask etc.
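For reference, the Google Similarity Distance (also called the Normalized Google Distance) mentioned above is commonly written as follows; this formula is quoted from the general literature rather than from the original report. Here f(x) and f(y) are the page counts returned for the terms x and y, f(x, y) is the page count for the combined query, and N is the total number of pages indexed by the search engine:

    NGD(x, y) = [max(log f(x), log f(y)) - log f(x, y)] / [log N - min(log f(x), log f(y))]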
III. LEXICAL SIMILARITY METHODS USED FOR MEASURING TEXT SIMILARITY

Lexical similarity measures how far the word sets of two given strings overlap. A lexical similarity of 1 means that the word sets coincide completely, while a lexical similarity of 0 means that the two texts have no words in common. This family of methods verifies the similarity between two texts at the character level; for example, IIT and NIT are lexically similar to each other because their sequences of characters are approximately in the same order.
A. Longest Common Subsequence similarity

Longest Common Subsequence (LCS) matching is a commonly used technique to measure the similarity between two strings. One way to measure the similarity of two or more sequences is to find their longest common subsequence, i.e. a subsequence of maximum possible length that is common to both sequences. For example, for

s1 = can i transfer my wallet balance to my bank account
s2 = can i transfer my bank balance to my wallet

the LCS of the two sequences is "can i transfer my balance to my".
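A minimal dynamic-programming sketch of word-level LCS (illustrative, not taken from the report's implementation):

def lcs(a, b):
    """Longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]  # dp[i][j] = LCS length of a[:i], b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Backtrack to recover one longest common subsequence.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return list(reversed(out))

s1 = "can i transfer my wallet balance to my bank account".split()
s2 = "can i transfer my bank balance to my wallet".split()
print(" ".join(lcs(s1, s2)))  # can i transfer my balance to my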
B. N-gram Similarity

Given a sequence of text, an n-gram is a contiguous sub-sequence of n items taken from that sequence [1]. In the n-gram similarity technique, the similarity of two strings is computed on the basis of how close (or how far apart) their n-grams are.

An N-gram of size 1 is called a unigram.
An N-gram of size 2 is called a bigram.
An N-gram of size 3 is called a trigram.

For example, for the sentence "We are going to school", if we take N=2 (bigrams), the n-grams would be: "We are", "are going", "going to", "to school".
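A short sketch that generates word n-grams and scores two sentences by how many bigrams they share (the Dice-style overlap used here is only one of several possible ways to compare the n-gram sets):

def ngrams(tokens, n=2):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_similarity(a, b, n=2):
    """Fraction of n-grams shared by two token lists (Dice-style overlap)."""
    ga, gb = set(ngrams(a, n)), set(ngrams(b, n))
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

sent = "We are going to school".split()
print(ngrams(sent, 2))
# [('We', 'are'), ('are', 'going'), ('going', 'to'), ('to', 'school')]
print(ngram_similarity(sent, "We are going to college".split()))  # 0.75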
C. Levenshtein distance Similarity

The Levenshtein distance technique uses a distance measure to quantify the similarity between two given strings. This distance counts the minimum number of operations (insertion, deletion and substitution) needed to transform one string into the other. For example, the Levenshtein distance between "bitten" and "sitting" is 3:

1. bitten -> sitten (substitution of s for b)
2. sitten -> sittin (substitution of i for e)
3. sittin -> sitting (insertion of g at the end)
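A compact sketch of the standard two-row dynamic-programming computation of this distance (illustrative only):

def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("bitten", "sitting"))  # 3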
D. Jaro Distance Similarity

This algorithm determines the similarity between two strings on the basis of their common characters [1]. The higher the Jaro score for two strings, the more similar the strings are. When the calculated result is 0, there is no similarity between the two texts, and when the result is 1, the two strings can be considered identical.
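A sketch of the usual Jaro computation (matching characters within a window, then counting transpositions); this is an illustrative implementation rather than the one used in the report:

def jaro(s1, s2):
    """Jaro similarity: 1.0 for identical strings, 0.0 for completely different ones."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(len(s1), len(s2)) // 2 - 1
    matched1, matched2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):  # characters match if they are equal and close enough
        for j in range(max(0, i - window), min(i + window + 1, len(s2))):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    t, k = 0, 0                 # count transpositions among the matched characters
    for i in range(len(s1)):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3

print(round(jaro("MARTHA", "MARHTA"), 4))  # 0.9444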
E. Cosine Similarity

Cosine similarity is an algorithm through which we can find the similarity between two texts. We represent the texts as non-zero vectors and measure the angle between these vectors. It is a similarity function that is often used in Information Retrieval; in the IR setting it measures the angle between two document vectors. If the cosine similarity between two documents is higher, the documents have more words in common, and if it is lower, they have fewer words in common.
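A minimal sketch that builds term-frequency vectors with plain Python and computes the cosine of the angle between them (the example sentences are illustrative):

import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

print(cosine_similarity("How can a person reduce weight",
                        "What are the best ways to lose weight"))  # ~0.14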
F. Jaccard Similarity

This similarity algorithm is also known as intersection over union. It is used to find the similarity between sets of words. If two sets A and B are given and we have to calculate the Jaccard similarity, we take the intersection of the two sets and divide it by their union.
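A short sketch of Jaccard similarity over the word sets of two texts (the example sentences are illustrative):

def jaccard_similarity(text_a, text_b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over the word sets of two texts."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

print(jaccard_similarity("How can a person reduce weight",
                         "What are the best ways to lose weight"))  # ~0.08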
IV. SEMANTIC SIMILARITY METHODS USED FOR MEASURING TEXT SIMILARITY

Semantic similarity between concepts is a measure of the semantic similarity, or semantic distance, between two concepts according to a given ontology [6]. In other words, semantic similarity is used to detect the common characteristics between certain concepts or documents. Semantic similarity methods are used intensively in most semantic and knowledge-based information search systems (to identify an optimal match between query terms and documents) [6]. Semantic similarity and semantic relatedness are two related notions, but semantic similarity is more specific than relatedness and can be considered a type of semantic relatedness [6]. For example, student and teacher are related terms that are not similar. Semantic similarity and semantic distance are defined inversely to each other: let s1 and s2 be two concepts belonging to two different nodes n1 and n2 in a given ontology; then the distance between the nodes n1 and n2 determines the similarity between the two concepts s1 and s2 [6]. Each of n1 and n2 can be considered a concept node of the ontology that contains a set of synonymous terms [6]. Two terms are synonymous if they lie in the same node, and in that case their semantic similarity is maximal [6].
A. Latent Semantic Analysis (LSA)

The formal assumption behind LSA is that the psychological similarity between any two words is reflected in the way they occur together in small sub-samples of language [8]. In this method we build a matrix in which words are placed as rows and contexts as columns [8]. Contexts can be anything, for example journalistic articles, textbooks or student essays, and the words are simply those that appear in the training set [8]. It is important to underline that the contexts with which the model is provided will determine the types of words with which it has experience, so the training set must be relevant to the task that the model must perform [8]. The first step is to associate each word with the contexts in which it is likely to appear [8]. In addition to recording the frequency with which a given word appears in certain texts, the model weights the entries to reflect the diagnostic ability of a word for a given context [8]. For example, a word that appears in a large number of very different contexts is not as diagnostic as a word that occurs less frequently and only in a small set of similar contexts [8].
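In practice, LSA is usually implemented by applying a truncated singular value decomposition to a weighted term-context matrix. The following sketch, which assumes scikit-learn and uses a tiny made-up corpus, projects documents into a low-dimensional latent space and compares them there:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Tiny illustrative corpus; a real LSA space would be trained on a much larger collection.
corpus = ["how can a person reduce weight",
          "what are the best ways to lose weight",
          "how do i open a bank account",
          "which bank gives the best savings account"]

X = TfidfVectorizer().fit_transform(corpus)                         # term-document matrix
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)   # latent document vectors

# Same-topic questions tend to end up closer together in the latent space.
print(cosine_similarity(Z[0:1], Z[1:2]))  # weight-loss pair
print(cosine_similarity(Z[0:1], Z[2:3]))  # weight-loss question vs banking question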
B. Hyperspace Analogue to Language (HAL)

HAL uses word co-occurrences to form a semantic space. In this method a matrix is created in which each row and each column represents a word, and each element of the matrix represents the strength of association between the corresponding row and column words. As the text is analyzed, a focus word is selected and compared with the nearby words, which are called co-occurring words [2]. The co-occurrence weight is inversely proportional to the distance from the focus word, and these values are stored in the matrix.
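A HAL-style sketch (simplified and illustrative; real HAL also distinguishes words appearing before and after the focus word) that accumulates distance-weighted co-occurrence counts inside a sliding window:

from collections import defaultdict

def hal_weights(tokens, window=3):
    """Co-occurrence weights: closer neighbours of the focus word get a higher weight."""
    weights = defaultdict(float)
    for i, focus in enumerate(tokens):
        for d in range(1, window + 1):
            if i + d < len(tokens):
                weights[(focus, tokens[i + d])] += window - d + 1  # weight shrinks with distance
    return weights

tokens = "the dog chased the cat and the cat ran away".split()
w = hal_weights(tokens)
print(w[("the", "cat")], w[("the", "away")])  # 6.0 1.0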
C. Semantic Similarity using Web Search Engines

This approach uses the results of a web search to find the semantic relationship between two words, relying on the page counts and the text snippets returned by the search engine. Suppose the similarity between two words A and B has to be found. The method searches for A and B separately and records the number of result pages; in the same way, the combined query of A and B is issued and its page count is determined. The snippets are used to find the term frequencies of A and B, and the observed patterns are ranked according to their ability to relate different words semantically. Finally, the similarity scores obtained from the page counts and from the snippets are integrated using support vector machines to evaluate the semantic similarity between the words.
D. Knowledge-based Similarity

Ontologies, taxonomies and semantic networks are forms of knowledge representation that are used in information retrieval, and these representations are combined with various methods to find the semantic similarity between different terms or concepts. Knowledge-based similarity is one of the semantic similarity measures; it uses information derived from semantic networks to identify the degree of similarity between words [2]. WordNet is the most widely used semantic network. It is a large lexical database of English in which verbs, nouns, adverbs and adjectives with similar senses are grouped into sets called synsets, and these synsets are connected to each other by conceptual-semantic and lexical relations. Knowledge-based similarity measures can be classified into measures of semantic similarity and measures of semantic relatedness.
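A small sketch of a knowledge-based measure using WordNet through NLTK (assumes the WordNet corpus has been downloaded with nltk.download('wordnet'); the chosen synsets are illustrative):

from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')
car = wn.synset('car.n.01')

# Path similarity is based on the shortest path between synsets in the WordNet hierarchy.
print(dog.path_similarity(cat))  # relatively high: both are animals
print(dog.path_similarity(car))  # lower: the concepts are far apart in the hierarchy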
V. DATASET

The dataset used in this work contains 400,000 question pairs, and each question carries its own Id number. The final field is a binary value (0 or 1) that indicates whether the question pair is duplicate or not.

This is the first public dataset released by Quora and it is related to the problem of identifying duplicate questions. In Quora, an important product principle is that there must be a single unambiguous question page for each logically distinct question. For example, the questions "What is the most populous state in the United States?" and "What is the state with the most people in the United States?" should not exist separately on Quora because the intent of both is identical. Having a canonical page for each distinct query makes knowledge sharing more efficient in many ways: for example, knowledge seekers can access all the answers to a question in one place, and authors can reach a greater number of readers than if the audience were divided across several pages. The dataset is based on actual data from Quora and gives anyone the opportunity to train and test models of semantic equivalence.
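A sketch of loading the data with pandas; the file name and the column names (question1, question2, is_duplicate) assume the publicly released Quora question-pairs CSV and should be adjusted to the local copy:

import pandas as pd

df = pd.read_csv("questions.csv")  # assumed file name for the Quora question-pairs release

print(len(df))                                           # number of question pairs
print(df[["question1", "question2", "is_duplicate"]].head())
print(df["is_duplicate"].value_counts())                 # how many pairs are labelled duplicate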
A. Some Quora question pairs from the dataset with their binary labels

[Figure: question.png — sample question pairs from the dataset with their duplicate labels]

VI. EXPERIMENT AND RESULT

A. Measuring Similarity between two sentences using the Cosine similarity method

Cosine similarity represents the two texts as non-zero vectors and measures the angle between them. It is a similarity function that is often used in Information Retrieval and is normally used in the context of text mining for comparing documents or emails. If the cosine similarity between two documents is higher, then both documents have a larger number of words in common [17].

1) Results: In the table below I have taken two questions from the dataset and measured the similarity between them by applying the Cosine similarity method.

[Figure: similarity.png — cosine similarity score for the selected question pair]
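Since the exact pair used in the table is only shown in the figure, the following sketch applies the same idea to an illustrative pair of questions using scikit-learn's count vectors:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative question pair (not necessarily the one shown in the figure above).
q1 = "How can I be a good geologist?"
q2 = "What should I do to be a great geologist?"

vectors = CountVectorizer().fit_transform([q1, q2])
score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(round(score, 3))  # ~0.34 for this pair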
B. Measuring Similarity between two sentences using the Jaccard similarity method

This similarity algorithm is also known as intersection over union. It is used to find the similarity between sets of words. The Web Jaccard coefficient can be computed as the number of elements in the intersection set divided by the number of elements in the union set [1].

1) Results: In the table below I have taken two questions from the dataset and measured the similarity between them by applying the Jaccard similarity method.

[Figure: similarity.png — Jaccard similarity score for the selected question pair]
VII. CONCLUSION

Measuring the similarity between words, sentences, documents and concepts is an important part of various tasks such as information retrieval, automatic essay scoring, short answer grading, document clustering, machine translation, web mining and text summarization, each of which uses different similarity techniques. So far I have used two techniques to check the similarity between two sentences: Cosine similarity and Jaccard similarity.
VIII. REFERENCES

[1] Pradhan, Nitesh, Manasi Gyanchandani, and Rajesh Wadhvani (2015), "A Review on Text Similarity Technique used in IR and its Application," International Journal of Computer Applications 120(9).
[2] Gomaa, Wael H., and Aly A. Fahmy (2013), "A survey of text similarity approaches," International Journal of Computer Applications 68(13): 13-18.
[3] Gupta, Aditi, et al. (2017), "A Survey on Semantic Similarity Measures," IJIRST - International Journal for Innovative Research in Science Technology 3: 12.
[4] Rensch, C. R. (1992), "Calculating lexical similarity," Windows on Bilingualism, 13-5.
[5] Mihalcea, R., Corley, C., and Strapparava, C. (2006), "Corpus-based and knowledge-based measures of text semantic similarity," In AAAI 2006 (Vol. 6, pp. 775-780).
[6] Slimani, T. (2013), "Description and evaluation of semantic similarity measures approaches," arXiv preprint arXiv:1310.8059.
[7] Zhang, J., Sun, Y., Wang, H., and He, Y. (2011), "Calculating statistical similarity between sentences," Journal of Convergence Information Technology 6(2).
[8] Simmons, S., and Estes, Z. (2006), "Using latent semantic analysis to estimate similarity," In Proceedings of the Cognitive Science Society (pp. 2169-2173).
[9] Ramaprabha, J., Das, S., and Mukerjee, P. (2018), "Survey on Sentence Similarity Evaluation using Deep Learning," In Journal of Physics: Conference Series (Vol. 1000, No. 1, p. 012070). IOP Publishing.
[10] http://www.stokastik.in/dynamic-programming-in-natural-language-processing-longest-common-subsequence/
[11] https://dataconomy.com/2015/04/implementing-the-five-most-popular-similarity-measures-in-python
[12] https://www.wikipedia.org/
[13] https://ai.googleblog.com/2018/05/advances-in-semantic-textual-similarity.html
[14] Achananuparp, P., Hu, X., and Shen, X. (2008), "The evaluation of sentence similarity measures," In International Conference on Data Warehousing and Knowledge Discovery (pp. 305-316). Springer, Berlin, Heidelberg.
[15] Pawar, A., and Mago, V. (2018), "Calculating the similarity between words and sentences using a lexical database and corpus statistics," arXiv preprint arXiv:1802.05667.
[16] https://www.listendata.com/2015/09/text-mining-basicsl
[17] https://github.com/tim5go/quora-question-pairs