Dependency Parsing - Part II
Pawan Goyal
CSE, IIT Kharagpur
October 16, 2014
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
1 / 48
Maximum Spanning Tree Based
Basic Idea
Starting from all possible connections, find the maximum spanning tree.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
2 / 48
Directed Spanning Trees
A directed spanning tree of a (multi-)digraph G = (V, A) is a subgraph
G0 = (V 0 , A0 ) such that:
I V0 = V
I A0 A, and |A0 | = |V 0 | 1
I G0 is a tree (acyclic)
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
3 / 48
Directed Spanning Trees
A directed spanning tree of a (multi-)digraph G = (V, A) is a subgraph
G0 = (V 0 , A0 ) such that:
I V0 = V
I A0 A, and |A0 | = |V 0 | 1
I G0 is a tree (acyclic)
A spanning tree of the following (multi-)digraphs
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
3 / 48
Weighted Directed Spanning Trees
Assume we have a weight function for each arc in a multi-digraph
G = (V, A).
Define wij k 0 to be the weight of (i, j, k) A for a multi-digraph
Define the weight of directed spanning tree G0 of graph G as
w(G0 ) =
wij k
(i,j,k)G0
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
4 / 48
Maximum Spanning Trees (MST)
Let T(G) be the set of all spanning trees for graph G
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
5 / 48
Maximum Spanning Trees (MST)
Let T(G) be the set of all spanning trees for graph G
The MST problem
Find the spanning tree G0 of the graph G that has the highest weight
G0 = arg max w(G0 ) = arg max
G0 T(G)
Pawan Goyal (IIT Kharagpur)
wij k
G0 T(G) (i,j,k)G0
Dependency Parsing - Part II
October 16, 2014
5 / 48
Finding MST
Directed Graph
For each sentence x, define the directed graph Gx = (Vx , Ex ) given by
Vx = {x0 = root, x1 , . . . , xn }
Ex = {(i, j) : i , j, (i, j) [0 : n] [1 : n]}
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
6 / 48
Finding MST
Directed Graph
For each sentence x, define the directed graph Gx = (Vx , Ex ) given by
Vx = {x0 = root, x1 , . . . , xn }
Ex = {(i, j) : i , j, (i, j) [0 : n] [1 : n]}
Gx is a graph with
the sentence words and the dummy root symbol as vertices and
a directed edge between every pair of distinct words and
a directed edge from the root symbol to every word
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
6 / 48
MST and Dependency Graph
Theorem
Every valid dependency graph for sentence x is equivalent to a directed
spanning tree for Gx that originates out of vertex 0
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
7 / 48
MST and Dependency Graph
Theorem
Every valid dependency graph for sentence x is equivalent to a directed
spanning tree for Gx that originates out of vertex 0
Three problems
Defining wij k corresponding to a feature space
Learning wij k from gold-standard data
At run-time, inferring the MST for a sentence x using the learnt weights
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
7 / 48
MST and Dependency Graph
Theorem
Every valid dependency graph for sentence x is equivalent to a directed
spanning tree for Gx that originates out of vertex 0
Three problems
Defining wij k corresponding to a feature space
Learning wij k from gold-standard data
At run-time, inferring the MST for a sentence x using the learnt weights
Lets talk about inference part first
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
7 / 48
Chu-Liu-Edmonds Algorithm
Chu-Liu-Edmonds Algorithm
Each vertex in the graph greedily selects the incoming edge with the
highest weight.
If a tree results, it must be a maximum spanning tree.
If not, there must be a cycle.
I
I
Identify the cycle and contract it into a single vertex.
Recalculate edge weights going into and out of the cycle.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
8 / 48
Chu-Liu-Edmonds Algorithm
x = John saw Mary
Build the directed graph
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
9 / 48
Chu-Liu-Edmonds Algorithm
Find the highest scoring incoming arc for each vertex
If this is a tree, then we have found MST.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
10 / 48
Chu-Liu-Edmonds Algorithm
If not a tree, identify cycle and contract
Recalculate arc weights into and out-of cycle
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
11 / 48
Chu-Liu-Edmonds Algorithm
Outgoing arc weights
Equal to the max of outgoing arc over all vertices in cycle
e.g., John Mary is 3 and saw Mary is 30.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
12 / 48
Chu-Liu-Edmonds Algorithm
Incoming arc weights
Equal to the weight of best spanning tree that includes head of incoming
arc and all nodes in cycle
root saw John is 40
root John saw is 29
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
13 / 48
Chu-Liu-Edmonds Algorithm
Calling the algorithm again on the contracted graph:
This is a tree and the MST for the contracted graph
Go back up the recursive call and reconstruct final graph
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
14 / 48
Chu-Liu-Edmonds Algorithm
The edge from wjs to Mary was from saw
The edge from root to wjs represented a tree from root to saw to John.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
15 / 48
Arc weights as linear classifiers
wij k = w.f (i, j, k)
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
16 / 48
Arc weights as linear classifiers
wij k = w.f (i, j, k)
Arc weights are a linear combination of features of the arc f (i, j, k) and a
corresponding weight vector w
What arc features?
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
16 / 48
Arc Features f (i, j, k)
Features
Identities of the words wi and wj for a label lk
head = saw & dependent=with
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
17 / 48
Arc Features f (i, j, k)
Features
Part-of-speech tags of the words wi and wj for a label lk
head-pos = Verb & dependent-pos=Preposition
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
18 / 48
Arc Features f (i, j, k)
Features
Part-of-speech of words surrounding and between wi and wj
inbetween-pos = Noun
inbetween-pos = Adverb
dependent-pos-right = Pronoun
head-pos-left=Noun
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
19 / 48
Arc Features f (i, j, k)
Features
Number of words between wi and wj , and their orientation
arc-distance = 3
arc-direction = right
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
20 / 48
Arc Features f (i, j, k)
Features
Combinations
head-pos=Verb & dependent-pos=Preposition & arc-label=PP
head-pos=Verb & dependent=with & arc-distance=3
No limit : any feature over arc (i, j, k) or input x
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
21 / 48
Learning the parameters
Re-write the inference problem
G = arg max
wij k
GT(Gx ) (i,j,k)G
= arg max w
GT(Gx )
f (i, j, k)
(i,j,k)G
= arg max w f (G)
GT(Gx )
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
22 / 48
Learning the parameters
Re-write the inference problem
G = arg max
wij k
GT(Gx ) (i,j,k)G
= arg max w
GT(Gx )
f (i, j, k)
(i,j,k)G
= arg max w f (G)
GT(Gx )
Which can be plugged into learning algorithms
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
22 / 48
Inference-based Learning
|T|
Training data: T = {(xt , Gt )}t=1
1.
2.
3.
4.
5.
6.
7.
8.
w(0) = 0; i = 0
for n : 1..N
for t : 1..|T|
Let G0 = argmaxG0 w(i) .f (G0 )
if G0 , G
w(i+1) = w(i) + f (Gt ) f (G0 )
i = i+1
return wi
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
23 / 48
Dependency Parsing as a Constraint Satisfaction problem
The last two approaches were data-driven
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
24 / 48
Constraint Satisfaction
Uses Constraint Dependency Grammar (CDG)
Grammatical rules are given as constraints on word-to-word modification
Parsing is formalized as a constraint satisfaction problem
Parsing is an eliminative process rather than a constructive process
Constraint satisfaction removes values that contradict constraints
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
25 / 48
Constraint Propagation
Three steps
Step 1. Form initial constraint network using a core grammar
Step 2. Remove local inconsistencies
Step 3. If ambiguity remains, add new constraints and repeat step 2.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
26 / 48
Constraint Propagation Example
Senence: Put the block on the floor on the table in the room
Simplified representation: V1 NP2 PP3 PP4 PP5
Correct analysis
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
27 / 48
Initial Constraints
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
28 / 48
Initial Constraints
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
29 / 48
Initial Constraints
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
30 / 48
Initial Constraints
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
31 / 48
Adding New Constraints
Still 14 possible analyses.
Introduce more constrains:
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
32 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
33 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
34 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
35 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
36 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
37 / 48
Modified Tables
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
38 / 48
Modified Tables
No object can be on two different objects.
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
39 / 48
Modified Tables
No object can be on two different objects.
Once every arc has been resolved, we get a unique dependency tree
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
39 / 48
Example from a Sanskrit Sentence: POS along with
dependency
Consider the sentence:
ramah
. vanam
. gacchati
1.
2.
3.
4.
5.
6.
7.
ramah
{masc.} {sg.} {nom.}
. = rama
ramah
=
r
a
{pr.}
{1p} {pl.}
.
vanam
. = vana {neu.} {sg.} {nom.}
vanam
. = vana {neu.} {sg.} {acc.}
gacchati = gam {pr.} {3p.} {sg.}
gacchati = gam {pr. part.} {masc.} {sg.} {loc.}
gacchati = gam {pr. part.} {neu.} {sg.} {loc.}
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
40 / 48
Example from a Sanskrit Sentence
1.
2.
3.
4.
5.
6.
7.
ramah
{masc.} {sg.} {nom.}
. = rama
ramah
=
r
a
{pr.}
{1p} {pl.}
.
vanam
. = vana {neu.} {sg.} {nom.}
vanam
. = vana {neu.} {sg.} {acc.}
gacchati = gam {pr.} {3p.} {sg.}
gacchati = gam {pr. part.} {masc.} {sg.} {loc.}
gacchati = gam {pr. part.} {neu.} {sg.} {loc.}
Figure: Possible Relations
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
41 / 48
Example from a Sanskrit Sentence
1.
2.
3.
4.
5.
6.
7.
Figure: Adjacency and Possible
ramah
{masc.} {sg.} {nom.}
. = rama
ramah
=
r
a
{pr.}
{1p} {pl.}
.
vanam
. = vana {neu.} {sg.} {nom.}
vanam
. = vana {neu.} {sg.} {acc.}
gacchati = gam {pr.} {3p.} {sg.}
gacchati = gam {pr. part.} {masc.} {sg.} {loc.}
gacchati = gam {pr. part.} {neu.} {sg.} {loc.}
A path P is a sequence of edges which
connects the nodes from S to F.
paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
42 / 48
Example from a Sanskrit Sentence
For example, S-1-3-5-F is a path.
Figure: A sample Path
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
43 / 48
How the Parser works?
Figure: Relations
Figure: Adjacency and Possible paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
44 / 48
How the Parser works?
Figure: Relations
Figure: Adjacency and Possible paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
45 / 48
How the Parser works?
Figure: Relations
Figure: Adjacency and Possible paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
46 / 48
How the Parser works?
Figure: Relations
Figure: Adjacency and Possible paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
47 / 48
How the Parser works?
Figure: Relations
Figure: Adjacency and Possible paths
Pawan Goyal (IIT Kharagpur)
Dependency Parsing - Part II
October 16, 2014
48 / 48