0% found this document useful (0 votes)
202 views31 pages

MST Dependency Parsing Guide

The document discusses MST-based dependency parsing. It describes how an MST parser works by first constructing a directed graph with words as vertices and all possible arcs between words, then applying the Chu-Liu-Edmonds algorithm to find the maximum spanning tree - the dependency parse with highest score. Arc weights represent dependency likelihoods and are modeled as linear classifiers over features of the words and their context. The algorithm is run to find the highest scoring parse tree.

Uploaded by

thejesh219
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
202 views31 pages

MST Dependency Parsing Guide

The document discusses MST-based dependency parsing. It describes how an MST parser works by first constructing a directed graph with words as vertices and all possible arcs between words, then applying the Chu-Liu-Edmonds algorithm to find the maximum spanning tree - the dependency parse with highest score. Arc weights represent dependency likelihoods and are modeled as linear classifiers over features of the words and their context. The algorithm is run to find the highest scoring parse tree.

Uploaded by

thejesh219
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

MST-based Dependency Parsing

Pawan Goyal

CSE, IIT Kharagpur

September 27th - October 3rd, 2016

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 1 / 24
Maximum Spanning Tree Based

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 2 / 24
Maximum Spanning Tree Based

Basic Idea
Starting from all possible connections, find the maximum spanning tree.

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 2 / 24
Maximum Spanning Tree Based

Basic Idea
Starting from all possible connections, find the maximum spanning tree.

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 2 / 24
Some Graph Theory Reminders

A graph G = (V, A) is a set of vertices V and arcs (i, j) A where i, j V .


Undirected graphs: (i, j) A (j, i) A
Directed graphs (digraphs) : (i, j) A ; (j, i) A

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 3 / 24
Multi-Digraphs

A multi-digraph is a digraph where multiple arcs between vertices are


possible
(i, j, k) A represents the kth arc from vertex i to vertex j.

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 4 / 24
Directed Spanning Trees

A directed spanning tree of a (multi-)digraph G = (V, A) is a subgraph


G0 = (V 0 , A0 ) such that:
I V0 = V
I A0 A, and |A0 | = |V 0 | 1
I G0 is a tree (acyclic)

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 5 / 24
Directed Spanning Trees

A directed spanning tree of a (multi-)digraph G = (V, A) is a subgraph


G0 = (V 0 , A0 ) such that:
I V0 = V
I A0 A, and |A0 | = |V 0 | 1
I G0 is a tree (acyclic)

A spanning tree of the following (multi-)digraphs

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 5 / 24
Weighted Directed Spanning Trees

Assume we have a weight function for each arc in a multi-digraph


G = (V, A).
Define wij k 0 to be the weight of (i, j, k) A for a multi-digraph
Define the weight of directed spanning tree G0 of graph G as
X
w(G0 ) = wij k
(i,j,k)G0

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 6 / 24
Maximum Spanning Trees (MST)

Let T(G) be the set of all spanning trees for graph G

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 7 / 24
Maximum Spanning Trees (MST)

Let T(G) be the set of all spanning trees for graph G

The MST problem


Find the spanning tree G0 of the graph G that has the highest weight
X
G0 = arg max w(G0 ) = arg max wij k
G0 T(G) G0 T(G) (i,j,k)G0

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 7 / 24
Finding MST

Directed Graph
For each sentence x, define the directed graph Gx = (Vx , Ex ) given by

Vx = {x0 = root, x1 , . . . , xn }

Ex = {(i, j) : i , j, (i, j) [0 : n] [1 : n]}

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 8 / 24
Finding MST

Directed Graph
For each sentence x, define the directed graph Gx = (Vx , Ex ) given by

Vx = {x0 = root, x1 , . . . , xn }

Ex = {(i, j) : i , j, (i, j) [0 : n] [1 : n]}

Gx is a graph with
the sentence words and the dummy root symbol as vertices and
a directed edge between every pair of distinct words and
a directed edge from the root symbol to every word

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 8 / 24
Chu-Liu-Edmonds Algorithm

Chu-Liu-Edmonds Algorithm
Each vertex in the graph greedily selects the incoming edge with the
highest weight.
If a tree results, it must be a maximum spanning tree.
If not, there must be a cycle.
I Identify the cycle and contract it into a single vertex.
I Recalculate edge weights going into and out of the cycle.

Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing September 27th - October 3rd, 2016 9 / 24
Chu-Liu-Edmonds Algorithm

x = John saw Mary

Build the directed graph

September 27th - October 3rd, 2016 10 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

Find the highest scoring incoming arc for each vertex

If this is a tree, then we have found MST.

September 27th - October 3rd, 2016 11 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

If not a tree, identify cycle and contract


Recalculate arc weights into and out-of cycle

September 27th - October 3rd, 2016 12 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

Outgoing arc weights


Equal to the max of outgoing arc over all vertices in cycle
e.g., John Mary is 3 and saw Mary is 30.

September 27th - October 3rd, 2016 13 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

Incoming arc weights


Equal to the weight of best spanning tree that includes head of incoming
arc and all nodes in cycle
root saw John is 40
root John saw is 29
September 27th - October 3rd, 2016 14 /
Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

Calling the algorithm again on the contracted graph:

This is a tree and the MST for the contracted graph


Go back up the recursive call and reconstruct final graph

September 27th - October 3rd, 2016 15 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Chu-Liu-Edmonds Algorithm

The edge from wjs to Mary was from saw


The edge from root to wjs represented a tree from root to saw to John.

September 27th - October 3rd, 2016 16 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc weights as linear classifiers

wij k = w.f (i, j, k)

September 27th - October 3rd, 2016 17 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc weights as linear classifiers

wij k = w.f (i, j, k)

Arc weights are a linear combination of features of the arc f (i, j, k) and a
corresponding weight vector w
What arc features?

September 27th - October 3rd, 2016 17 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc Features f (i, j, k)

Features
Identities of the words wi and wj for a label lk
head = saw & dependent=with

September 27th - October 3rd, 2016 18 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc Features f (i, j, k)

Features
Part-of-speech tags of the words wi and wj for a label lk
head-pos = Verb & dependent-pos=Preposition

September 27th - October 3rd, 2016 19 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc Features f (i, j, k)

Features
Part-of-speech of words surrounding and between wi and wj

inbetween-pos = Noun
inbetween-pos = Adverb
dependent-pos-right = Pronoun
head-pos-left=Noun
September 27th - October 3rd, 2016 20 /
Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc Features f (i, j, k)

Features
Number of words between wi and wj , and their orientation

arc-distance = 3
arc-direction = right

September 27th - October 3rd, 2016 21 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Arc Features f (i, j, k)

Features
Combinations
head-pos=Verb & dependent-pos=Preposition & arc-label=PP
head-pos=Verb & dependent=with & arc-distance=3
No limit : any feature over arc (i, j, k) or input x

September 27th - October 3rd, 2016 22 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Learning the parameters

Re-write the inference problem


X
G = arg max wij k
GT(Gx ) (i,j,k)G

X
= arg max w f (i, j, k)
GT(Gx ) (i,j,k)G

= arg max w f (G)


GT(Gx )

September 27th - October 3rd, 2016 23 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Learning the parameters

Re-write the inference problem


X
G = arg max wij k
GT(Gx ) (i,j,k)G

X
= arg max w f (i, j, k)
GT(Gx ) (i,j,k)G

= arg max w f (G)


GT(Gx )

Which can be plugged into learning algorithms

September 27th - October 3rd, 2016 23 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24
Inference-based Learning

|T|
Training data: T = {(xt , Gt )}t=1
1. w(0) = 0; i = 0
2. for n : 1..N
3. for t : 1..|T|
4. Let G0 = argmaxG0 w(i) .f (G0 )
5. if G0 , G
6. w(i+1) = w(i) + f (Gt ) f (G0 )
7. i = i+1
8. return wi

September 27th - October 3rd, 2016 24 /


Pawan Goyal (IIT Kharagpur) MST-based Dependency Parsing 24

You might also like