Superior and Efficient Fully Unsupervised Pattern-based Concept
Acquisition Using an Unsupervised Parser
Dmitry Davidov1  Roi Reichart1  Ari Rappoport2
1 ICNC, 2 Institute of Computer Science
Hebrew University of Jerusalem
{[email protected]|roiri@cs|arir@cs}.huji.ac.il
Abstract

Sets of lexical items sharing a significant aspect of their meaning (concepts) are fundamental for linguistics and NLP. Unsupervised concept acquisition algorithms have been shown to produce good results, and are preferable over manual preparation of concept resources, which is labor intensive, error prone and somewhat arbitrary. Some existing concept mining methods utilize supervised language-specific modules such as POS taggers and computationally intensive parsers.

In this paper we present an efficient fully unsupervised concept acquisition algorithm that uses syntactic information obtained from a fully unsupervised parser. Our algorithm incorporates the bracketings induced by the parser into the meta-patterns used by a symmetric patterns and graph-based concept discovery algorithm. We evaluate our algorithm on very large corpora in English and Russian, using both human judgments and WordNet-based evaluation. Using similar settings as the leading fully unsupervised previous work, we show a significant improvement in concept quality and in the extraction of multiword expressions. Our method is the first to use fully unsupervised parsing for unsupervised concept discovery, and requires no language-specific tools or pattern/word seeds.

1 Introduction

Comprehensive lexical resources for many domains and languages are essential for most NLP applications. One of the most utilized types of such resources is a repository of concepts: sets of lexical items sharing a significant aspect of their meanings (e.g., types of food, tool names, etc).

While handcrafted concept databases (e.g., WordNet) are extensively used in NLP, manual compilation of such databases is labor intensive, error prone, and somewhat arbitrary. Hence, for many languages and domains great efforts have been made for automated construction of such databases from available corpora. While language-specific and domain-specific studies show significant success in development of concept discovery frameworks, the majority of domains and languages remain untreated. Hence there is a need for a framework that performs well for many diverse settings and is as unsupervised and language-independent as possible.

Numerous methods have been proposed for seed-based concept extraction, where a set of concept patterns (or rules), or a small set of seed words for each concept, is provided as input to the concept acquisition system. However, even simple definitions for concepts are not always available.

To avoid requiring this type of input, a number of distributional and pattern-based methods have been proposed for fully unsupervised seed-less acquisition of concepts from text. Pattern-based algorithms were shown to obtain high quality results while being highly efficient in comparison to distributional methods. Such fully unsupervised methods do not incorporate any language-specific parsers or taggers, so they can be successfully applied to diverse languages.
However, unsupervised pattern-based methods suffer from several weaknesses. They are frequently restricted to single-word terms and are unable to discover multiword expressions in an efficient and precise manner. They also usually ignore potentially useful part-of-speech and other syntactic information. In order to address these weaknesses, several studies utilize language-specific parsing or tagging systems in concept acquisition. Unfortunately, while improving results, this heavily affects the language- and domain-independence of such frameworks, and severely impacts efficiency, since even shallow parsing is computationally demanding.

In this paper we present a method to utilize the information induced by unsupervised parsers in an unsupervised pattern-based concept discovery framework. With the recent development of fast fully unsupervised parsers, it is now possible to add parser-based information to lexical patterns while keeping the language-independence of the whole framework and still avoiding heavy computational costs. Specifically, we incorporate the bracketings induced by the parser into the meta-patterns used by a symmetric patterns and graph-based unsupervised concept discovery algorithm.

We performed a thorough evaluation on two English corpora (the BNC and a 68GB web corpus) and on a 33GB Russian corpus. Evaluations were done using both human judgments and WordNet, in similar settings as those of the leading unsupervised previous work. Our results show that utilization of an unsupervised parser both improves the assignment of single-word terms to concepts and allows high-precision discovery and assignment of multiword expressions to concepts.

2 Previous Work

Much work has been done on lexical acquisition of all sorts and the acquisition of concepts in particular. Concept acquisition methods differ in the type of corpus annotation and other human input used, and in their basic algorithmic approach. Some methods directly aim at concept acquisition, while the direct goal in others is the construction of hyponym ('is-a') hierarchies. A subtree in such a hierarchy can be viewed as defining a concept.

A major algorithmic approach is to represent word contexts as vectors in some space and use distributional measures and clustering in that space. Pereira (1993), Curran (2002) and Lin (1998) use syntactic features in the vector definition. (Pantel and Lin, 2002) improves on the latter by clustering by committee. Caraballo (1999) uses conjunction and appositive annotations in the vector representation. Several studies avoid requiring any syntactic annotation. Some methods are based on decomposition of a lexically-defined matrix (by SVD, PCA etc), e.g. (Schütze, 1998; Deerwester et al., 1990). While great effort has been made for improving the computational complexity of distributional methods (Gorman and Curran, 2006), they still remain highly computationally intensive in comparison to pattern approaches (see below), and most of them do not scale well for very large datasets.

The second main approach is to use lexico-syntactic patterns. Patterns have been shown to produce more accurate results than feature vectors, at a lower computational cost on large corpora (Pantel et al., 2004). Since (Hearst, 1992), who used a manually prepared set of initial lexical patterns, numerous pattern-based methods have been proposed for the discovery of concepts from seeds. Other studies develop concept acquisition for on-demand tasks where concepts are defined by user-provided seeds. Many of these studies utilize information obtained by language-specific parsing and named entity recognition tools (Dorow et al., 2005). Pantel et al. (2004) reduce the depth of linguistic data used, but their method requires POS tagging.

TextRunner (Banko et al., 2007) utilizes a set of pattern-based seed-less strategies in order to extract relational tuples from text. However, this system contains many language-specific modules, including the utilization of a parser in one of the processing stages. Thus the majority of the existing pattern-based concept acquisition systems rely on pattern/word seeds or supervised language-specific tools, some of which are very inefficient.

Davidov and Rappoport (2006) developed a framework which discovers concepts based on high frequency words and symmetry-based pattern graph properties. This framework allows a fully unsupervised seed-less discovery of concepts without relying on language-specific tools. However, it completely ignores potentially useful syntactic or morphological information.
For example, the pattern 'X and his Y' is useful for acquiring the concept of family member types, as in "his siblings and his parents". Without syntactic information, it can capture noise, as in "... in ireland) and his wife)" (parentheses denote syntactic constituent boundaries). As another example, the useful symmetric pattern "either X or Y" can appear in both good examples ("choose either Chihuahua or Collie.") and bad ones ("either Collie or Australian Bulldog"). In the latter case, the algorithm both captures noise ("Australian" is now considered as a candidate for the 'dog type' concept) and misses the discovery of a valid multiword candidate ("Australian Bulldog"). While symmetry-based filtering greatly reduces such noise, the basic problem remains. As a result, incorporating at least some parsing information in a language-independent and efficient manner could be beneficial.

Unsupervised parsing has been explored for several decades (see (Clark, 2001; Klein, 2005) for recent reviews). Recently, unsupervised parsers have for the first time outperformed the right branching heuristic baseline for English. These include CCM (Klein and Manning, 2002), the DMV and DMV+CCM models (Klein and Manning, 2004), (U)DOP based models (Bod, 2006a; Bod, 2006b; Bod, 2007), an exemplar based approach (Dennis, 2005), guiding EM using contrastive estimation (Smith and Eisner, 2006), and the incremental parser of Seginer (2007), which we use here. These works learn an unlabeled syntactic structure, dependency or constituency. In this work we use constituency trees as our syntactic representation.

Another important factor in concept acquisition is the source of textual data used. To take advantage of the rapidly expanding web, many of the proposed frameworks utilize web queries rather than local corpora (Etzioni et al., 2005; Davidov et al., 2007; Pasca and Van Durme, 2008; Davidov and Rappoport, 2009). While these methods have a definite practical advantage of dealing with the most recent and comprehensive data, web-based evaluation has some methodological drawbacks such as limited repeatability (Kilgarriff, 2007). In this study we apply our framework to offline corpora, in settings similar to those of previous work, in order to be able to make proper comparisons.

3 Efficient Unsupervised Parsing

Our method utilizes the information induced by unsupervised parsers. Specifically, we make use of the bracketings induced by Seginer's parser (Seginer, 2007), which is freely available at http://staff.science.uva.nl/~yseginer/ccl. This parser has advantages in three major aspects relevant to this paper.

First, it achieves state of the art unsupervised parsing performance: its F-score is 75.9% for sentences of up to 10 words from the Penn Treebank Wall Street Journal corpus (WSJ) (Marcus, 1993), and 59% for sentences of the same length from the German NEGRA corpus (Brants, 1997). These corpora consist of newspaper texts. (The F-score is F = 2·R·P / (R+P), where R and P are the recall and precision of the parser's bracketing compared to a manually created bracketing of the same text; this is the accepted measure for parsing performance, see (Klein, 2005).)

Second, to obtain good results, manually created POS tags are used as input in all the unsupervised parsers mentioned above except for Seginer's, which uses raw sentences as input. (Headden et al., 2008) have shown that the performance of algorithms that require POS tags substantially decreases when using POS tags induced by unsupervised POS taggers instead of manually created ones. Seginer's incremental parser is therefore the only fully unsupervised parser providing high quality parses.

Third, Seginer's parser is extremely fast. During its initial stage, the parser builds a lexicon. Our Pentium 2.8GHz machines with 4GB RAM can store in memory the lexicon created by up to 0.2M sentences. We thus divided our corpora into batches of 0.2M sentences and parsed each of them separately. Note that in this setup parsing quality might be even better than the quality reported in (Seginer, 2007), since in the setup reported in that paper the parser was applied to a few thousand sentences only. On average, the parsing time of a single batch was 5 minutes (run time did not significantly differ across batches and corpora).
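For illustration only, the following minimal Python sketch shows this batching scheme under our own simplifying assumptions; parse_batch is a hypothetical stand-in for an external call to Seginer's parser and is not part of its published interface:

BATCH_SIZE = 200_000   # 0.2M sentences per batch, so that the parser's lexicon fits in RAM

def read_sentences(path):
    """Yield one raw sentence per line (non-alphabetic characters already removed)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

def parse_corpus(path, parse_batch):
    """Split the corpus into fixed-size batches and parse each one independently."""
    bracketed, batch = [], []
    for sentence in read_sentences(path):
        batch.append(sentence)
        if len(batch) == BATCH_SIZE:
            bracketed.extend(parse_batch(batch))   # a fresh parser lexicon per batch
            batch = []
    if batch:
        bracketed.extend(parse_batch(batch))
    return bracketed

Since each batch is parsed with its own lexicon, batches can also be processed in parallel, which is how the cluster timing figures in Section 6 are obtained.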
Parser description. The parser utilizes the novel common-cover link representation for syntactic structure. This representation resembles dependency structure, but unlike the latter, it can be translated into a constituency tree, which is the syntactic representation we use in this work.

The parsing algorithm creates the common-cover link structure of a sentence in an incremental manner. This means that the parser reads the words of a sentence one after the other and, as each word is read, it is only allowed to add links that have one of their ends at that word (and to update existing ones). Words which have not yet been read are not available to the parser at this stage. This restriction is inspired by psycholinguistics research which suggests that humans process language incrementally. It results in a significant restriction of the parser's search space, which is the reason it is so fast.

During its initial stage the parser builds a lexicon containing, for each word, statistics helping the decision of whether to link that word to other words. The lexicon is updated as each new sentence is read. Lexicon updating is also done in an incremental manner, so this stage is also very fast.

4 Unsupervised Pattern Discovery

In the first stage of our algorithm, we run the unsupervised parser on the corpus in order to produce a bracketing structure for each sentence. In the second stage, described here, we use these bracketings in order to discover, in a fully unsupervised manner, patterns that could be useful for concept mining.

Our algorithm is based on the concept acquisition method of (Davidov and Rappoport, 2006). We discover patterns that connect terms belonging to the same concept in two main stages: discovery of pattern candidates, and identification of the symmetric patterns among the candidates.

Pattern candidates. A major idea of (Davidov and Rappoport, 2006) is that a few dozen high frequency words (HFWs) such as 'and' and 'is' connect other, less frequent content terms into relationships. They define meta-patterns, which are short sequences of H's and C's, where H is a slot for a HFW and C is a slot for a content word (later to become a word belonging to a discovered concept). Their method was shown to produce good results. However, the fact that it does not consider any syntactic information causes problems. Specifically, it does not consider the constituent structure of the sentence. Meta-patterns that cross constituent boundaries are likely to generate noise – two content words (C's) in a meta-pattern that belong to different constituents are likely to belong to different concepts as well. In addition, meta-patterns that do not occupy a full constituent are likely to 'cut' multiword expressions (MWEs) into two parts, one part that gets treated as a valid C word and one part that is completely ignored.

The main idea in the present paper is to use the bracketings induced by unsupervised parsers in order to avoid the problems above. We utilize bracketing boundaries in our meta-patterns in addition to HFW and C slots. In other words, their original meta-patterns are totally lexical, while ours are lexico-syntactic meta-patterns. We preserve the attractive properties of meta-patterns, because both HFWs and bracketings can be found or computed in a language independent manner and very efficiently.

Concretely, we define a HFW as a word appearing more than TH times per million words, and a C as a word or multiword expression containing up to 4 words, appearing less than TC times per million. We require that our patterns include two slots for C's, separated by at least a single HFW or bracket. We allow separation by a single bracket because the lowest level in the induced bracketing structure usually corresponds to lexical items, while higher levels correspond to actual syntactic constituents.

In order to avoid truncation of multiword expressions, we also require the meta-pattern to start and end with a HFW or bracket. Thus our meta-patterns match the following regular expression:

{H|B}* C1 {H|B}+ C2 {H|B}*

where '*' means zero or more times, '+' means one or more times, and B can be a '(' or ')' bracket produced by the parser (in these patterns we do not need to guarantee that brackets match properly). Examples of such patterns include "((C1) in C2))", "(C1)(such(as(((C2)", and "(C1) and (C2)". (This paper does not use any punctuation, since the parser is provided with sentences having all non-alphabetic characters removed; we assume word separation. C1 and C2 can be a word or a multiword expression.) We dismiss rare patterns that appear less than TP times per million words.
Symmetric patterns. Many of the pattern candidates discovered in the previous stage are not usable. In order to find a usable subset, we focus on the symmetric patterns. We define a symmetric pattern as a pattern in which the same pair of terms (C words) is likely to appear in both left-to-right and right-to-left orders. In order to identify symmetric patterns, for each pattern we define a pattern graph G(P), as proposed by (Widdows and Dorow, 2002). If term pair (C1, C2) appears in pattern P in some context, we add nodes c1, c2 to the graph and a directed edge EP(c1, c2) between them. In order to select symmetric patterns, we create such a pattern graph for every discovered pattern, and create a symmetric subgraph SymG(P) in which we take only the bidirectional edges from G(P). Then we compute three measures for each pattern candidate, as proposed by (Davidov and Rappoport, 2006):

M1(P) := |{c1 | ∃c2 EP(c1, c2) ∧ ∃c3 EP(c3, c1)}| / |Nodes(G(P))|

M2(P) := |Nodes(SymG(P))| / |Nodes(G(P))|

M3(P) := |Edges(SymG(P))| / |Edges(G(P))|

For each measure, we prepare a sorted list of all candidate patterns. We remove patterns that are not in the top ZT (we use 100, see Section 6) in any of the three lists, and patterns that are in the bottom ZB in at least one of the lists.
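The following minimal Python sketch shows how these three measures could be computed for a single pattern from its extracted (C1, C2) instances; the set-based representation is our own illustrative choice, not the original implementation.

def symmetry_measures(instances):
    """Compute M1, M2, M3 for one pattern P from its (C1, C2) term-pair instances."""
    edges = set(instances)                                       # directed edges of G(P)
    nodes = {c for edge in edges for c in edge}
    sym_edges = {(a, b) for (a, b) in edges if (b, a) in edges}  # edges of SymG(P)
    sym_nodes = {c for edge in sym_edges for c in edge}
    has_out = {a for (a, _) in edges}
    has_in = {b for (_, b) in edges}
    m1 = len(has_out & has_in) / len(nodes)    # nodes with both an out- and an in-edge
    m2 = len(sym_nodes) / len(nodes)
    m3 = len(sym_edges) / len(edges)
    return m1, m2, m3

Patterns would then be ranked once per measure, keeping only those that appear in the top ZT of at least one list and in the bottom ZB of none.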
5 Concept Discovery

At the end of the previous stage we have a set of symmetric patterns. We now use them in order to discover concepts. The concept discovery algorithm is essentially the same as that used by (Davidov and Rappoport, 2006) and has some similarity with the one used by (Widdows and Dorow, 2002). In this section we outline the algorithm.

The clique-set method. The utilized approach to concept discovery is based on connectivity structures in the all-pattern term relationship graph G, resulting from merging all of the single-pattern graphs for the symmetric patterns selected in the previous stage. The main observation regarding G is that highly interconnected words are good candidates to form a concept. We find all strong n-cliques (subgraphs containing n nodes that are all interconnected in both directions). A clique Q defines a concept that contains all of the nodes in Q plus all of the nodes that are (1) at least unidirectionally connected to all nodes in Q, and (2) bidirectionally connected to at least one node in Q. Using this definition, we create a concept for each such clique.

Note that a single term can be assigned to several concepts. Thus a clique based on a connection of the word 'Sun' to 'Microsoft' can lead to a concept of computer companies, while the connection of 'Sun' to 'Earth' can lead to a concept of celestial bodies.

Reducing noise: merging and windowing. Since any given term can participate in many cliques, the algorithm creates overlapping categories, some of which are redundant. In addition, due to the nature of language and the imperfection of the corpus, some noise is obviously to be expected. We enhance the quality of the obtained concepts by merging them and by windowing on the corpus. We merge two concepts Q, R iff there is more than a 50% overlap between them: (|Q ∩ R| > |Q|/2) ∧ (|Q ∩ R| > |R|/2). In order to increase concept quality and remove concepts that are too context-specific, we use a simple corpus windowing technique. Instead of running the algorithm of this section on the whole corpus, we divide the corpus into windows of equal size and perform the concept discovery algorithm of this section (without pattern discovery) on each window independently. We now have a set of concepts for each window. For the final set, we select only those concepts that appear in at least two of the windows. This technique reduces noise at the potential cost of lowering coverage.

A decrease in the number of windows should produce more noisy results, while discovering more concepts and terms. In the next section we show that while windowing is clearly required for a large corpus, incorporation of parser data increases the quality of the extracted concepts to the point where windowing can be significantly reduced.
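To illustrate the clique-based concept definition and the merging criterion above, here is a minimal Python sketch under our own simplifying assumptions (an explicit set of directed edges and brute-force membership checks); the function names are ours and this is not the original implementation.

def expand_clique(clique, nodes, edges):
    """Turn a strong clique Q into a concept: Q plus every node connected (in at least
    one direction) to all of Q and bidirectionally connected to at least one node of Q."""
    def connected(a, b):
        return (a, b) in edges or (b, a) in edges
    def bidirectional(a, b):
        return (a, b) in edges and (b, a) in edges
    concept = set(clique)
    for n in set(nodes) - concept:
        if all(connected(n, q) for q in clique) and any(bidirectional(n, q) for q in clique):
            concept.add(n)
    return concept

def merge_if_overlapping(q, r):
    """Merge two concepts iff their intersection exceeds half the size of each."""
    overlap = len(q & r)
    return q | r if (overlap > len(q) / 2 and overlap > len(r) / 2) else None

Windowing would then keep only those merged concepts that are produced in at least two corpus windows.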
6 Results

In order to estimate the quality of concepts and to compare it to previous work, we have performed both automatic and human evaluation. Our basic comparison was to (Davidov and Rappoport, 2006) (we have obtained their data and utilized their algorithm), where we can estimate whether incorporation of parser data can solve some fundamental weaknesses of their framework. In the following description, we call their algorithm P and our parser-based framework P+. We have also performed an indirect comparison to (Widdows and Dorow, 2002).

While there is a significant number of other related studies on concept acquisition (see Section 2), most of which are supervised and/or use language-specific tools, direct or even indirect comparison to these works is problematic due to differences in corpora, problem definitions and evaluation strategies. Below we describe the corpora and parameters used in our evaluation, and then show and discuss the WordNet-based and human evaluation settings and results.

Corpora. We performed in-depth evaluation in two languages, English and Russian, using three corpora, two for English and one for Russian. The first English corpus is the BNC, containing about 100M words. The second English corpus, DMOZ (Gabrilovich and Markovitch, 2005), is a web corpus obtained by crawling URLs in the Open Directory Project (dmoz.org), resulting in 68GB containing about 8.2G words from 50M web pages. The Russian corpus (Davidov and Rappoport, 2006) was assembled from web-based Russian repositories, to yield 33GB and 4G words. All of these corpora were also used by (Davidov and Rappoport, 2006), and the BNC was used in similar settings by (Widdows and Dorow, 2002).

Algorithm parameters. The thresholds TH, TC, TP, ZT, ZB were determined mostly by practical memory size considerations: we computed thresholds that would give us the maximal number of terms, while enabling the pattern access table to reside in main memory. The resulting numbers are 100, 50, 20, 100, 100. Corpus window size was determined by starting from a small window size, extracting a single window at random, running the algorithm, and iterating this process with window sizes increased ×2 until reaching a desired vocabulary concept participation percentage before windowing (i.e., x% of the different words in the corpus participate in terms assigned to concepts; we used 5%). We also ran the algorithm without windowing in order to check how well the provided parsing information can help reduce noise. Among the patterns discovered are the ubiquitous ones containing "and" and "or", e.g. '((X) or (a Y))', and additional ones such as 'from (X) to (Y)'.

Influence of parsing data on number of discovered concepts. Table 1 compares the concept acquisition framework with (P+) and without (P) utilization of parsing data.

          V       W             C             AS
                 P     P+      P     P+      P      P+
DMOZ      16    330   504     142   130     12.8   16.0
BNC      0.3     25    42     9.6   8.9     10.2   15.6
Russ.     10    235   406     115    96     11.6   15.1

Table 1: Results for concept discovery with (P+) and without (P) utilization of parsing data. V is the total number (millions) of different words in the corpus. W is the number (thousands) of words belonging to at least one of the terms of one of the concepts. C is the number (thousands) of concepts (after merging and windowing). AS is the average category size (in words).

We can see that the number of different words covered by discovered concepts rises nearly 1.5-fold when we utilize patterns based on parsing data, in comparison to the pure HFW patterns used in previous work. We can also see nearly the same increase in average concept size. At the same time we observe about a 15% reduction in the total number of discovered concepts.

There are two opposite factors in P+ which may influence the number of concepts, their size and coverage in comparison to P. On one hand, utilization of more restricted patterns that include parsing information leads to a reduced number of concept term instances being discovered. Thus, the P+ pattern "(X (or (a Y))" will recognize the instance "(TV (or (a movie))" and will miss "(lunch) or (a snack))", while the P pattern "X or a Y" will capture both. This leads to a decrease in the number of discovered concepts. On the other hand, P+ patterns, unlike P ones, allow the extraction of multiword expressions (while the P method can potentially be used to extract MWEs, preliminary experimentation shows that without significant modification the quality of MWEs obtained by P is very low in comparison to P+), and indeed more than a third of the terms discovered using P+ were MWEs. Utilization of MWEs not only allows covering a greater number of different words, but also increases the number of discovered concepts, since new concepts can be found using cliques of newly discovered MWEs. From the results, we can see that for a given concept size and word coverage, the ability to discover MWEs overcomes the disadvantage of ignoring potentially useful concepts.

Human judgment evaluation. Our human judgment evaluation closely followed the protocol of (Davidov and Rappoport, 2006).

We used 4 subjects for evaluation of the English concepts and 4 subjects for the Russian ones. In order to assess subjects' reliability, we also included random concepts (see below). The goal of the experiment was to examine the differences between the P+ and P concept acquisition frameworks. Subjects were given 50 triplets of words and were asked to rank them using the following scale: (1) the words definitely share a significant part of their meaning; (2) the words have a shared meaning but only in some context; (3) the words have a shared meaning only under a very unusual context/situation; (4) the words do not share any meaning; (5) I am not familiar enough with some/all of the words.

The 50 triplets were obtained as follows. We randomly selected 40 concept pairs (C+, C), with C+ in P+ and C in P, using the following five restrictions: (1) concepts should contain at least 10 words; (2) for a selected pair, C+ should share at least half of its single-word terms with C, and C should share at least half of its words with C+; (3) C+ should contain at least 3 MWEs; (4) C should contain at least 3 words not appearing in C+; (5) C+ should contain at least 3 single-word terms not appearing in C.

These restrictions allow us to select concept pairs such that C+ is similar to C while they still carry enough differences to be examined. We selected the triplets as follows: for the pairs (C+, C), ten triplets include terms appearing in both C+ and C (the Both column in Table 2), ten triplets include single-word terms appearing in C+ but not in C (the P+ single column), ten triplets include single-word terms appearing in C but not in C+ (the P column), ten triplets include MWEs appearing in C+ (the P+ mwe column), and ten triplets include random terms obtained from P+ concepts (the Rand column).

                      P+ mwe   P+ single    P     Both   Rand
% shared meaning
  DMOZ                  85        88       68      81      6
  BNC                   85        90       61      88      0
  Russ.                 89        95       70      93     11
triplet score (1-4)
  DMOZ                 1.7       1.4      2.5     1.7    3.8
  BNC                  1.6       1.3      2.1     1.5    4.0
  Russ.                1.5       1.1      2.0     1.3    3.7

Table 2: Results of evaluation by human judgment on the three data sets. P+ single/mwe: single-word/MWE terms existing only in the P+ concept; P: single-word terms existing only in the P concept; Both: terms existing in both concepts; Rand: random terms. See text for detailed explanations.

The first part of Table 2 gives the average percentage of triplets that were given scores of 1 or 2 (that is, 'significant shared meaning'). The second part gives the average score of a triplet (1 is best). In these lines scores of 5 were not counted. Inter-evaluator Kappa values are 0.68/0.75/0.76 for DMOZ, BNC and Russian, respectively. We can see that terms selected by P and skipped by P+ receive low scores, while at the same time even single-word terms selected by P+ and skipped by P show very high scores. This shows that, using parser data, the proposed framework can successfully avoid selection of erroneous terms, while discovering high-quality terms missed by P. We can also see that P+ performance on MWEs, while being slightly inferior to that on single-word terms, still achieves comparable results. Thus our algorithm can greatly improve the results not only by discovering MWEs but also by improving the set of single-word concept terms.

WordNet-based evaluation. The major guideline in this part of the evaluation was to compare our results with previous work (Davidov and Rappoport, 2006; Widdows and Dorow, 2002) without the possible bias of human evaluation. We have followed their methodology as best as we could, using the same WordNet (WN) categories and the same corpora. This also allows indirect comparison to several other studies; thus (Widdows and Dorow, 2002) reports results for an LSA-based clustering algorithm that are vastly inferior to the pattern-based ones.

The evaluation method is as follows. We took the exact 10 WN subsets referred to as 'subjects' in (Widdows and Dorow, 2002), and removed all multiword items. We then selected at random 10 pairs of words from each subject. For each pair, we found the largest of our discovered concepts containing it. The various morphological forms or clear typos of the same word were treated as one in the evaluation.

We have improved the evaluation framework for Russian by using the Russian WordNet (Gelfenbeyn et al., 2003) instead of back-translations as done in (Davidov and Rappoport, 2006). Preliminary examination shows that this has no apparent effect on the results.

For each found concept C containing N words, we computed the following: (1) Precision: the number of words present in both C and WN divided by N; (2) Precision*: the number of correct words divided by N, where correct words are either words that appear in the WN subtree, or words whose entry in the American Heritage Dictionary or the Britannica directly defines them as belonging to the given class (e.g., 'murder' is defined as 'a crime'); this was done in order to overcome the relative poorness of WN; (3) Recall: the number of words present in both C and WN divided by the number of words in WN; (4) the percentage of correctly discovered words (according to Precision*) that are not in WN.
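As an illustration only (not the original evaluation code), the per-concept measures could be computed along the following lines, treating a concept, a WN subtree and the manually verified extra words as plain sets of words:

def evaluate_concept(concept, wn_subtree, extra_correct):
    """Per-concept Precision, Precision*, Recall and the share of correct words not in WN."""
    n = len(concept)
    in_wn = concept & wn_subtree
    correct = in_wn | (concept & extra_correct)   # extra_correct: dictionary-verified words
    precision = len(in_wn) / n                    # measure (1)
    precision_star = len(correct) / n             # measure (2)
    recall = len(in_wn) / len(wn_subtree)         # measure (3)
    pct_new = 100 * len(correct - wn_subtree) / max(len(correct), 1)   # measure (4)
    return precision, precision_star, recall, pct_new

The reported numbers are macro-averages of these values over the 10 WN subjects.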
Table 3 compares the macro-averages over these 10 categories to the corresponding related work.

                  Prec.   Prec.*    Rec.   %New
DMOZ      P        79.8    86.5    22.7     2.5
          P+       79.5    91.3    28.6     3.7
BNC       P       92.76   95.72    7.22     0.4
          P+       93.0    96.1    14.6     1.7
          Widdows  82.0       -       -       -
Russian   P       82.39   89.64   20.03     2.1
          P+       83.5    92.6    29.6     4.0

Table 3: WordNet evaluation in comparison to P (Davidov and Rappoport, 2006) and to Widdows (Widdows and Dorow, 2002). Columns show average precision, precision* (as defined in the text), recall, and the % of new words added to the corresponding WN subtree.

We do not observe an apparent rise in precision when comparing P+ and P, but we can see a significant improvement in both recall and precision* for all three corpora. In combination with the human judgment results, this suggests that the P+ framework successfully discovers more correct terms not present in WN. This causes precision to remain constant while precision* improves significantly. The rise in recall also shows that the P+ framework can discover significantly more correct terms from the same data.

Windowing requirement. As discussed in Section 5, windowing is required for successful noise reduction. However, due to the increase in pattern quality with parser data, it is likely that less noise will be captured by the discovered patterns. Hence, windowing could be relaxed, allowing us to obtain more data with sufficiently high precision.

In order to test this issue we applied our algorithms on the DMOZ corpus with 3 different windowing settings: (1) choosing the window size as described above; (2) using a ×4 larger window; (3) avoiding windowing altogether. Each time we randomly sampled a set of 100 concepts and tagged (by the authors) the noisy ones. A concept is considered to be noisy if it has at least 3 words unrelated to each other. Table 4 shows the results of this test.

       Reg. window   ×4 window   No windowing
P            4            18            33
P+           4             5            21

Table 4: Percentage of noisy concepts as a function of windowing.

We can see that while windowing is still essential even with available parser data, using this data we can significantly reduce the windowing requirements, allowing us to discover more concepts from the same data.

Timing requirements are modest, considering that we parsed such large amounts of data. BNC parsing took 45 minutes, and the total single-machine processing time for the 68GB DMOZ corpus was 4 days (in fact, we used a PC cluster, and all 3 corpora were parsed in 15 hours). In comparison, a state-of-the-art supervised parser (Charniak and Johnson, 2005) would process the same amount of data in 1.3 years, considering the reported parsing rate of 10 sentences per second.

7 Discussion

We have presented a framework which utilizes an efficient fully unsupervised parser for unsupervised pattern-based discovery of concepts. We showed that utilization of an unsupervised parser in pattern acquisition not only allows successful extraction of MWEs but also improves the quality of the obtained concepts, avoiding noise and adding new terms missed by the parse-less approach. At the same time, the framework remains fully unsupervised, allowing its straightforward application to different languages, as supported by our bilingual evaluation.

This research presents one more step towards the merging of fully unsupervised techniques for lexical acquisition, allowing the extraction of semantic data without strong assumptions on domain or language. While we have aimed for concept acquisition, the proposed framework can also be useful for the extraction of different types of lexical relationships, both among concepts and between concept terms.
References

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, Oren Etzioni, 2007. Open Information Extraction from the Web. IJCAI '07.

Rens Bod, 2006a. An All-Subtrees Approach to Unsupervised Parsing. ACL '06.

Rens Bod, 2006b. Unsupervised Parsing with U-DOP. CoNLL X.

Rens Bod, 2007. Is the End of Supervised Parsing in Sight? ACL '07.

Thorsten Brants, 1997. The NEGRA Export Format. CLAUS Report, Saarland University.

Sharon Caraballo, 1999. Automatic Construction of a Hypernym-labeled Noun Hierarchy from Text. ACL '99.

Eugene Charniak and Mark Johnson, 2005. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. ACL '05.

Alexander Clark, 2001. Unsupervised Language Acquisition: Theory and Practice. Ph.D. thesis, University of Sussex.

James R. Curran, Marc Moens, 2002. Improvements in Automatic Thesaurus Extraction. SIGLEX '02, 59–66.

Dmitry Davidov, Ari Rappoport, 2006. Efficient Unsupervised Discovery of Word Categories using Symmetric Patterns and High Frequency Words. COLING-ACL '06.

Dmitry Davidov, Ari Rappoport, Moshe Koppel, 2007. Fully Unsupervised Discovery of Concept-Specific Relationships by Web Mining. ACL '07.

Dmitry Davidov, Ari Rappoport, 2009. Translation and Extension of Concepts Across Languages. EACL '09.

Scott Deerwester, Susan Dumais, George Furnas, Thomas Landauer, Richard Harshman, 1990. Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41(6):391–407.

Simon Dennis, 2005. An Exemplar-based Approach to Unsupervised Parsing. Proceedings of the 27th Conference of the Cognitive Science Society.

Beate Dorow, Dominic Widdows, Katarina Ling, Jean-Pierre Eckmann, Danilo Sergi, Elisha Moses, 2005. Using Curvature and Markov Clustering in Graphs for Lexical Acquisition and Word Sense Discrimination. MEANING '05.

Oren Etzioni, Michael Cafarella, Doug Downey, S. Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel Weld, Alexander Yates, 2005. Unsupervised Named-entity Extraction from the Web: An Experimental Study. Artificial Intelligence, 165(1):91–134.

Evgeniy Gabrilovich, Shaul Markovitch, 2005. Feature Generation for Text Categorization Using World Knowledge. IJCAI '05.

Ilya Gelfenbeyn, Artem Goncharuk, Vladislav Lehelt, Anton Lipatov, Victor Shilo, 2003. Automatic Translation of WordNet Semantic Network to Russian Language (in Russian). International Dialog 2003 Workshop.

James Gorman, James R. Curran, 2006. Scaling Distributional Similarity to Large Corpora. COLING-ACL '06.

William P. Headden III, David McClosky and Eugene Charniak, 2008. Evaluating Unsupervised Part-of-Speech Tagging for Grammar Induction. COLING '08.

Marti Hearst, 1992. Automatic Acquisition of Hyponyms from Large Text Corpora. COLING '92.

Adam Kilgarriff, 2007. Googleology is Bad Science. Computational Linguistics, 33(1):147–151.

Dan Klein and Christopher Manning, 2002. A Generative Constituent-Context Model for Improved Grammar Induction. ACL '02.

Dan Klein and Christopher Manning, 2004. Corpus-based Induction of Syntactic Structure: Models of Dependency and Constituency. ACL '04.

Dan Klein, 2005. The Unsupervised Learning of Natural Language Structure. Ph.D. thesis, Stanford University.

Hang Li, Naoki Abe, 1996. Clustering Words with the MDL Principle. COLING '96.

Dekang Lin, 1998. Automatic Retrieval and Clustering of Similar Words. COLING '98.

Mitchell P. Marcus, Beatrice Santorini and Mary Ann Marcinkiewicz, 1993. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330.

Marius Pasca, Benjamin Van Durme, 2008. Weakly-supervised Acquisition of Open-domain Classes and Class Attributes from Web Documents and Query Logs. ACL '08.

Patrick Pantel, Dekang Lin, 2002. Discovering Word Senses from Text. SIGKDD '02.

Patrick Pantel, Deepak Ravichandran, Eduard Hovy, 2004. Towards Terascale Knowledge Acquisition. COLING '04.

Fernando Pereira, Naftali Tishby, Lillian Lee, 1993. Distributional Clustering of English Words. ACL '93.

Hinrich Schütze, 1998. Automatic Word Sense Discrimination. Computational Linguistics, 24(1):97–123.

Yoav Seginer, 2007. Fast Unsupervised Incremental Parsing. ACL '07.

Noah A. Smith and Jason Eisner, 2006. Annealing Structural Bias in Multilingual Weighted Grammar Induction. ACL '06.

Dominic Widdows, Beate Dorow, 2002. A Graph Model for Unsupervised Lexical Acquisition. COLING '02.