Natural Language Processing: Venue:ADB-405

This document provides an introduction to syntax in natural language processing. It defines syntax as the study of the structure and rules of language, including how words are arranged into phrases, clauses, and sentences. The document then discusses where syntax fits in relation to other areas of linguistics such as semantics, morphology, and phonology. It also outlines different techniques for analyzing syntax, including sequential and hierarchical breaking down of sentences and labeling constituents with parts of speech or syntactic functions.
CSE528

Natural Language Processing


Venue:ADB-405 Topic: Syntax

Prof. Tulasi Prasad Sariki,
SCSE, VIT Chennai Campus
www. l [Link] l y. com
Contents
• What is Syntax?
• Where does it fit?
• Simplified View of Linguistics
• Grammatical Analysis Techniques

INTRODUCTION TO SYNTAX
What is Syntax?
• The study of the structure of language.
• Refers to the way words are arranged together and the relationships between them.
• Syntax is the study of the system of rules and categories that underlies sentence formation.
• Syntax is the study of how words combine into phrases, clauses, and sentences.
• Syntax describes how sentences and their constituents are structured.

What is Syntax? (contd.)
• Roughly, the goal is to relate surface form (what we perceive when someone says something) to meaning (the semantics of the utterance).
• Specifically, the goal is to relate an interface to a morphological component to an interface to a semantic component.
• Note: the interface to the morphological component may look like written text.
• The representational device is a tree structure.

Where does it fit?

Semantics

Syntax

Lexicon

Simplified View of Linguistics

Phonology:                                  → /waddyasai/
Morphology:  /waddyasai/                    → what did you say
Syntax:      what did you say               → say (subj: you, obj: what)
Semantics:   say (subj: you, obj: what)     → P[ λx. say(you, x) ]

Acronyms used in structural descriptions of natural language
S=sentence/clause ADJP=adjective phrase
N=(a single) noun ADV=adverb
NP=noun phrase ADVP=adverb phrase
V=verb DET=determiner
VP=verb phrase CONJ=conjunction
AUX=auxiliary verb COMP=complementizer
AJ/ADJ=adjective PRO=pro-constituent
PUNC=punctuation

Examples
S=sentence/clause          Does the dog chase the cat?
N=(a single) noun          dog
NP=noun phrase             the old dog
V=verb                     chase
VP=verb phrase             chase the cat
AUX=auxiliary verb         does
AJ/ADJ=adjective           old
ADJP=adjective phrase      old and gray
ADV=adverb                 happily
ADVP=adverb phrase         once upon a time
DET=determiner             the
CONJ=conjunction           and
COMP=complementizer        what
PRO=pro-constituent        he
PUNC=punctuation           ?

Grammatical Analysis Techniques
Two main devices:

Breaking up a String      Labeling the Constituents
• Sequential              • Morphological
• Hierarchical            • Categorial
• Transformational        • Functional

Sequential Breaking up
That student solved the problems.

that + student + solve + ed + the + problem + s

Sequential Breaking up and Morphological Labeling
That student solved the problems.

that   student   solve   ed      the    problem   s
word   word      stem    affix   word   stem      affix
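
The breaking up and labeling above can be sketched in a few lines of Python. This is only an illustration: the stem list, the suffix list, and the function names `segment` and `break_up` are assumptions made for this example, not part of the original slides, and real morphological analysis needs far more than suffix stripping.

```python
# Toy stem lexicon and suffix list; both are illustrative assumptions.
STEMS = {"solve", "problem"}
SUFFIXES = ("ed", "s")

def segment(word):
    """Split a word into stem + affix if a known stem fits, else keep it whole."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix):
            base = w[: -len(suffix)]
            # "solved" strips to "solv", so also try restoring a final "e".
            for stem in (base, base + "e"):
                if stem in STEMS:
                    return [(stem, "stem"), (suffix, "affix")]
    return [(w, "word")]

def break_up(sentence):
    """Break a sentence into a flat sequence of labeled pieces."""
    pieces = []
    for token in sentence.rstrip(".").split():
        pieces.extend(segment(token))
    return pieces

for piece, label in break_up("That student solved the problems."):
    print(piece, label)
```

Words that are not in the toy stem list are simply labeled "word", which mirrors the labels on the slide.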

Sequential Breaking up and Categorial Labeling
This boy can solve the problem.

this   boy   can   solve   the   problem
Det    N     Aux   V       Det   N

They called her a taxi.

They   call   ed      her    a     taxi
Pron   V      Affix   Pron   Det   N
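
Categorial labeling of a sequence can be approximated by a dictionary lookup. The sketch below assumes a tiny hand-written lexicon (`LEXICON`) that covers only the words on this slide; the names are hypothetical, and the sketch ignores the fact that one form can belong to more than one category.

```python
# Toy lexicon: an illustrative assumption covering only the slide's words.
LEXICON = {
    "this": "Det", "the": "Det", "a": "Det",
    "boy": "N", "problem": "N", "taxi": "N",
    "can": "Aux", "solve": "V",
    "they": "Pron", "her": "Pron",
}

def label_categories(sentence):
    """Pair each word with its category; unknown words get '?'."""
    words = sentence.rstrip(".").split()
    return [(w, LEXICON.get(w.lower(), "?")) for w in words]

print(label_categories("This boy can solve the problem."))
```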


Sequential Breaking up and Functional Labeling
The same string can receive two functional labelings:

They      called   her             a taxi
Subject   Verbal   Direct Object   Indirect Object

They      called   her               a taxi
Subject   Verbal   Indirect Object   Direct Object

Hierarchical Breaking up
"Old men and women" can be broken up in two ways:

[ [Old] [ [men] [and] [women] ] ]
[ [ [Old] [men] ] [and] [women] ]

Hierarchical Breaking up and Categorial Labeling
Poor John ran away.

[ [ [Poor]A [John]N ]NP [ [ran]V [away]Adv ]VP ]S

Hierarchical Breaking up and Functional Labeling
• Immediate Constituent (IC) Analysis
• Construction types in terms of the function of the constituents:
  ◦ Predication (subject + predicate)
  ◦ Modification (modifier + head)
  ◦ Complementation (verbal + complement)
  ◦ Subordination (subordinator + dependent unit)
  ◦ Coordination (independent unit + coordinator + independent unit)

Syntax as defined by Bloomfield
Syntax is the study of free forms that are composed entirely of free forms.

Central notions of his theory:
• Form classes
• Constituent structures

Form-Classes
Form-class: a set of forms displaying similar or identical grammatical features is said to constitute a form-class. For example:
‘walk’, ‘come’, ‘run’, ‘jump’ belong to the form-class of infinitive expressions.
‘John’, ‘the boys’, ‘Mr. Smith’ belong to the form-class of nominative substantive expressions.
Form-classes are similar to the traditional parts of speech.
One and the same form can belong to more than one form-class.

Form-Classes (contd.)
Criterion for form-class membership: substitutability.
In a sentence like “John went to the church”, ‘John’ can be substituted with ‘children’, ‘Mr. Smith’ or ‘the boys’, as these are syntactically equivalent to each other and display identical grammatical features.
Thus, form-classes are sets of forms, any one of which may be substituted for any other in a given construction.
The smaller forms into which a larger form may be analyzed are its constituents, and the larger form is a construction.

Example of the Constituents of a Construction
The phrase "poor John" is a construction analyzable into, or composed of, the constituents "poor" and "John."
Similarly, the phrase "lost his watch" is composed of "lost," "his," and "watch," all of which may be described as constituents of the construction, put together in a linear order.

Constituency
Sentences or phrases can be analyzed as being composed of a number of somewhat smaller units called constituents (e.g. a noun phrase might consist of a determiner and a noun).
This constituent analysis can be continued until no further subdivisions are possible.
Immediate constituents: the major divisions that can be made at each step of the analysis.
Ultimate constituents: the irreducible elements of the construction resulting from such an analysis.

Immediate Constituents
An immediate constituent is a daughter of some larger unit that constitutes a construction. Immediate constituents are often further reducible.
There exists no intermediate unit between them that is a constituent of the same construction, e.g. in the construction ‘poor John’, ‘poor’ and ‘John’ are immediate constituents.

Constructions
Subordinating Constructions - Constructions in which only one
immediate constituent is of the same form class as the whole
construction e.g. ‘poor John’, ‘fresh milk’.
The constituent that is syntactically equivalent to the whole
construction is described as the head, and its partner is described as
the modifier: thus, in "poor John," the form "John" is the head, and
"poor" is its modifier.

Constructions (contd.)
Coordinating Constructions - Constructions in which both constituents are of the same form class as the whole construction, e.g. ‘men and women’, ‘boys and girls’.
In "men and women," it may be assumed, the immediate constituents are the word "men" and the word "women," each of which is syntactically equivalent to "men and women."

Immediate Constituent Structure
The organization of the units of a sentence (its immediate constituents) in terms of both their hierarchical arrangement and their linear order.
IC structure can be represented as a tree diagram or with labeled bracketing, where each analytic decision is represented by a pair of square brackets at the appropriate points in the construction.

Immediate Constituent Structure (contd.)
‘Poor John lost his watch’ is not just a linear sequence of five words.
It can be analyzed into the immediate constituents ‘poor John’ and ‘lost his watch’, and each of these constituents is analyzable into its own immediate constituents.
The ultimate constituents of the whole construction are ‘poor’, ‘John’, ‘lost’, ‘his’, ‘watch’.
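
The difference between immediate and ultimate constituents can be made concrete with nested lists. In this sketch each list stands for a construction and each string for a word; the representation and the function names are assumptions chosen for illustration only.

```python
# Nested lists mirror the analysis: each list is a construction,
# each string is a word (an ultimate constituent).
sentence = [["poor", "John"], ["lost", ["his", "watch"]]]

def immediate_constituents(construction):
    """The daughters of a construction, exactly one level down."""
    return list(construction) if isinstance(construction, list) else []

def ultimate_constituents(construction):
    """The words reached when no further subdivision is possible."""
    if isinstance(construction, str):
        return [construction]
    words = []
    for part in construction:
        words.extend(ultimate_constituents(part))
    return words

print(immediate_constituents(sentence))     # two daughters
print(ultimate_constituents(sentence))      # five words
print(immediate_constituents(sentence[1]))  # 'lost' and the unit 'his watch'
```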

Immediate Constituent Structure (contd.)
In ‘poor John’, ‘poor’ and ‘John’ are constituents as well as immediate constituents, as there is no intermediate unit between them that is a constituent of the same construction.
Similarly, in ‘lost his watch’, ‘lost’, ‘his’ and ‘watch’ are constituents, but not all of them are immediate constituents.

Immediate Constituent Structure (contd.)
In ‘lost his watch’, ‘his’ and ‘watch’ combine to make the intermediate construction ‘his watch’.
‘his watch’ then combines with ‘lost’ to give ‘lost his watch’.
‘his’ and ‘watch’ are the immediate constituents of ‘his watch’, and ‘lost’ and ‘his watch’ are the immediate constituents of ‘lost his watch’.

Representing Immediate Constituent Structure
The constituent structure of the whole sentence can be represented by means of labeled bracketing, e.g.
[ [ [Poor] [John] ] [ [lost] [ [his] [watch] ] ] ]
Or using a tree diagram for the same:

Poor John lost his watch
  poor John
    poor
    John
  lost his watch
    lost
    his watch
      his
      watch

Representing Immediate Constituent Structure (contd.)
Labeled bracketing using category symbols:

[ [ [Poor]ADJ [John]N ]NP [ [lost]V [ [his]PRON [watch]N ]NP ]VP ]S

‘Poor’ – ADJ      ‘Poor John’ – NP
‘John’ – N        ‘his watch’ – NP
‘lost’ – V        ‘lost his watch’ – VP
‘his’ – PRON      ‘Poor John lost his watch’ – S
‘watch’ – N
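
Labeled bracketing can be generated mechanically from a tree. The sketch below assumes a tree encoded as nested tuples of the form (label, child, child, ...), with a word as the only child of a lexical category; this encoding and the function name `bracket` are illustrative choices, not something the slides prescribe.

```python
# Each node is (label, children...); a lexical node holds a single word.
tree = ("S",
        ("NP", ("ADJ", "Poor"), ("N", "John")),
        ("VP", ("V", "lost"),
               ("NP", ("PRON", "his"), ("N", "watch"))))

def bracket(node):
    """Render a tree as labeled bracketing with the label after the bracket."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return f"[{children[0]}]{label}"
    inner = " ".join(bracket(child) for child in children)
    return f"[ {inner} ]{label}"

print(bracket(tree))
```

Each pair of brackets corresponds to one analytic decision, so the nesting depth of the output reflects the depth of the tree.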

Immediate Constituent Structure using Tree Diagram

S
  NP
    ADJ   Poor
    N     John
  VP
    V     lost
    NP
      PRON  his
      N     watch

Importance of the notion of Immediate Constituent
It helps to account for the syntactic ambiguity of certain constructions.
A classic example is the phrase "old men and women," which may be interpreted in two different ways:
1. One associates "old" with "men and women"; the immediate constituents are "old" and "men and women."
2. The other associates "old" just with "men"; the immediate constituents are "old men" and "women."
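
The two readings can be written as two different nestings over the same word sequence. This is a minimal sketch using nested lists; the variable and function names are hypothetical, and the coordinator "and" is kept as its own element, which is one possible choice among several.

```python
reading_1 = ["old", ["men", "and", "women"]]   # "old" applies to "men and women"
reading_2 = [["old", "men"], "and", "women"]   # "old" applies only to "men"

def words(construction):
    """Flatten a construction to its word sequence."""
    if isinstance(construction, str):
        return [construction]
    return [w for part in construction for w in words(part)]

def show(construction):
    """Render the nesting with square brackets."""
    if isinstance(construction, str):
        return construction
    return "[" + " ".join(show(part) for part in construction) + "]"

print(words(reading_1) == words(reading_2))  # same surface string
print(show(reading_1))
print(show(reading_2))
```

The surface strings are identical; only the constituent structure tells the readings apart.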

Predication
The predicate is the part of a sentence or clause containing a verb and stating something about the subject.

[ [Birds]subject [fly]predicate ]S

Modification
[A]modifier [flower]head
John [slept]head [in the room]modifier

[ [John]subject [ [slept]head [in the room]modifier ]predicate ]S

Complementation
He [saw]verbal [a lake]complement

[ [He]subject [ [saw]verbal [a lake]complement ]predicate ]S

Complements are required to complete the meaning of a sentence or a part of a sentence.

Subordination
John slept [in]subordinator [the room]dependent unit

[ [John]subject [ [slept]head [ [in]subordinator [the room]dependent unit ]modifier ]predicate ]S

Subordination is a way of combining sentences that makes one sentence more important than the other.

Coordination
[John came in time]independent unit [but]coordinator [Mary was not ready]independent unit

Coordination is a way of adding sentences together.


An Example
In the morning, the sky looked much brighter.

S
  Modifier: In the morning
    Subordinator: In
    Dependent Unit: the morning
      Modifier: the
      Head: morning
  Head: the sky looked much brighter
    Subject: the sky
      Modifier: the
      Head: sky
    Predicate: looked much brighter
      Verbal: looked
      Complement: much brighter
        Modifier: much
        Head: brighter


Hierarchical Breaking up and Categorial / Functional Labeling
Hierarchical breaking up coupled with categorial / functional labeling is a very powerful device.
But there are ambiguities which demand something more powerful.
E.g., "love of God" can mean either:
• Someone loves God
• God loves someone

Hierarchical Breaking up
Both labelings assign "love of God" a single structure, so neither separates the two readings:

Categorial Labeling:   [love]Noun Phrase [of God]Prepositional Phrase
Functional Labeling:   [love]Head [ [of]Subordinator [God]Dependent Unit ]Modifier

Types of Generative Grammar
• Finite State Model
  (sequential)
• Phrase Structure Model
  (sequential + hierarchical) + (categorial)
• Transformational Model
  (sequential + hierarchical + transformational) + (categorial + functional)

Phrase Structure Grammar (PSG)
A phrase-structure grammar G consists of a four-tuple (V, T, S, P):
V is a finite set of symbols (the alphabet, or vocabulary)
◦ E.g., N, V, A, Adv, P, NP, VP, AP, AdvP, PP, student, sing, etc.
T is a finite set of terminal symbols: T ⊆ V
◦ E.g., student, sing, etc.
S is a distinguished non-terminal symbol, also called the start symbol: S ∈ V
P is a set of productions.
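
The four-tuple can be written down directly as a small data structure. The sketch below is an assumption-laden illustration: the class name `PSG`, the checker `is_well_formed`, and the toy productions are invented for this example, and the checker additionally assumes that every left-hand side is a single non-terminal (the context-free case), which the definition above does not require.

```python
from typing import NamedTuple

class PSG(NamedTuple):
    V: frozenset   # vocabulary: terminals and non-terminals together
    T: frozenset   # terminal symbols, a subset of V
    S: str         # start symbol, a non-terminal
    P: tuple       # productions as (left-hand side, right-hand side) pairs

def is_well_formed(g):
    """Check T ⊆ V, S non-terminal, and that productions only use symbols in V."""
    nonterminals = g.V - g.T
    return (
        g.T <= g.V
        and g.S in nonterminals
        and all(lhs in nonterminals and all(sym in g.V for sym in rhs)
                for lhs, rhs in g.P)
    )

# Toy grammar; the productions are hypothetical examples.
toy = PSG(
    V=frozenset({"S", "NP", "VP", "N", "V", "student", "sing"}),
    T=frozenset({"student", "sing"}),
    S="S",
    P=(("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("V",)),
       ("N", ("student",)), ("V", ("sing",))),
)
print(is_well_formed(toy))
```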

Noun Phrases
John                                  [ [John]N ]NP
the student                           [ [the]Det [student]N ]NP
the intelligent student               [ [the]Det [intelligent]AdjP [student]N ]NP
his first five PhD students           [ [his]Det [first]Ord [five]Quant [PhD]N [students]N ]NP
the five best students of my class    [ [the]Det [five]Quant [best]AP [students]N [of my class]PP ]NP

Verb Phrases
can sing                              [ [can]Aux [sing]V ]VP
can hit the ball                      [ [can]Aux [hit]V [the ball]NP ]VP
can give a flower to Mary             [ [can]Aux [give]V [a flower]NP [to Mary]PP ]VP
may make John the chairman            [ [may]Aux [make]V [John]NP [the chairman]NP ]VP
may find the book very interesting    [ [may]Aux [find]V [the book]NP [very interesting]AP ]VP

Prepositional Phrases
in the classroom    [ [in]P [the classroom]NP ]PP
near the river      [ [near]P [the river]NP ]PP

Adjective Phrases
intelligent       [ [intelligent]A ]AP
very honest       [ [very]Degree [honest]A ]AP
fond of sweets    [ [fond]A [of sweets]PP ]AP

very worried that she might have done badly in the assignment
[ [very]Degree [worried]A [that she might have done badly in the assignment]S’ ]AP

Phrase Structure Rules
The boy hit the ball.
Rewrite rules:
(i)   S → NP VP
(ii)  NP → Det N
(iii) VP → V NP
(iv)  Det → the
(v)   N → boy, ball
(vi)  V → hit
We interpret each rule X → Y as the instruction "rewrite X as Y".

Derivation
The boy hit the ball.
Sentence
NP + VP (i)
Det + N + VP (ii)
Det + N + V + NP (iii)
The + N + V + NP (iv)
The + boy + V + NP (v)
The + boy + hit + NP (vi)
The + boy + hit + Det + N (ii)
The + boy + hit + the + N (iv)
The + boy + hit + the + ball (v)
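
The derivation can be replayed step by step in code. This sketch assumes the rewrite rules are stored in a dictionary keyed by their rule numbers, with rule (v) split into two hypothetical entries, one per noun; at each step it rewrites the leftmost occurrence of the rule's left-hand symbol. The names `RULES`, `STEPS`, and `derive` are illustrative.

```python
RULES = {
    "i":      ("S",   ["NP", "VP"]),
    "ii":     ("NP",  ["Det", "N"]),
    "iii":    ("VP",  ["V", "NP"]),
    "iv":     ("Det", ["the"]),
    "v-boy":  ("N",   ["boy"]),    # rule (v), choosing "boy"
    "v-ball": ("N",   ["ball"]),   # rule (v), choosing "ball"
    "vi":     ("V",   ["hit"]),
}

# The order of rule applications used in the derivation.
STEPS = ["i", "ii", "iii", "iv", "v-boy", "vi", "ii", "iv", "v-ball"]

def derive(start, steps):
    """Apply each rule to the leftmost occurrence of its left-hand symbol."""
    form = [start]
    history = [" + ".join(form)]
    for name in steps:
        lhs, rhs = RULES[name]
        i = form.index(lhs)        # raises ValueError if the symbol is absent
        form[i:i + 1] = rhs
        history.append(" + ".join(form))
    return history

for line in derive("S", STEPS):
    print(line)
```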

PSG Parse Tree
The boy hit the ball.

[ [ [the]Det [boy]N ]NP [ [hit]V [ [the]Det [ball]N ]NP ]VP ]S
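
A parse tree like this one can be recovered automatically from the rewrite rules. The following is a minimal top-down parser with backtracking, written only for illustration: the grammar dictionary restates the toy rules, the function names are assumptions, and the approach would loop forever on left-recursive rules such as NP → NP PP.

```python
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["boy"], ["ball"]],
    "V":   [["hit"]],
}

def parse(symbol, words, start=0):
    """Yield (tree, next_position) for every way symbol can cover words[start:]."""
    if symbol not in GRAMMAR:                       # terminal: must match a word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:
        partial = [([], start)]                     # (children so far, position)
        for part in rhs:
            partial = [(kids + [tree], nxt)
                       for kids, pos in partial
                       for tree, nxt in parse(part, words, pos)]
        for kids, pos in partial:
            yield (symbol, *kids), pos

def parse_sentence(sentence):
    """Return all trees rooted in S that cover the whole sentence."""
    words = sentence.lower().rstrip(".").split()
    return [tree for tree, end in parse("S", words) if end == len(words)]

print(parse_sentence("The boy hit the ball."))
```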

PSG Parse Tree
John wrote those words in the Book of Proverbs.

[ [ [John]PropN ]NP [ [wrote]V [those words]NP [ [in]P [ [the book]NP [of proverbs]PP ]NP ]PP ]VP ]S

Penn POS Tags
John wrote those words in the Book of Proverbs.
[ John/NNP ]
wrote/VBD
[ those/DT words/NNS ]
in/IN
[ the/DT Book/NN ]
of/IN
[ Proverbs/NNS ]
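
Text in this slash-tagged form is easy to read back into (word, tag) pairs. The helper below is a hypothetical sketch: it assumes every token has the shape word/TAG and simply discards the square brackets that mark chunks.

```python
def read_tagged(text):
    """Return (word, tag) pairs from slash-tagged text; chunk brackets are ignored."""
    pairs = []
    for token in text.replace("[", " ").replace("]", " ").split():
        # Split on the last slash so words containing "/" keep their tag.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs

tagged = ("[ John/NNP ] wrote/VBD [ those/DT words/NNS ] in/IN "
          "[ the/DT Book/NN ] of/IN [ Proverbs/NNS ]")
for word, tag in read_tagged(tagged):
    print(word, tag)
```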

Penn Treebank
John wrote those words in the Book of Proverbs.
(S (NP-SBJ (NP John))
(VP wrote
(NP those words)
(PP-LOC in
(NP (NP-TTL (NP the Book)
(PP of
(NP Proverbs)))
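
Parenthesized trees in this style can be read with a short recursive reader. The sketch below is illustrative only: it assumes fully balanced input in which every opening parenthesis is followed by a label, and the example string is a simplified, balanced tree made up for this purpose rather than a real treebank entry.

```python
import re

def read_tree(text):
    """Parse '(LABEL child ...)' into nested tuples (label, child, ...)."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)

    def node(pos):
        label = tokens[pos + 1]          # tokens[pos] is the opening "("
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = node(pos)
            else:
                child, pos = tokens[pos], pos + 1
            children.append(child)
        return (label, *children), pos + 1

    tree, _ = node(0)
    return tree

def leaves(tree):
    """Collect the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in leaves(child)]

example = "(S (NP-SBJ John) (VP wrote (NP those words)))"  # simplified example
print(read_tree(example))
print(leaves(read_tree(example)))
```

Unbalanced input makes this reader fail with an IndexError, so a production reader would need explicit error handling.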

PSG Parse Tree
Official trading in the shares will start in Paris on Nov 6.

[ [ [ [official]AP [trading]N ]NP [ [in]P [the shares]NP ]PP ]NP [ [will]Aux [start]V [in Paris]PP [on Nov 6]PP ]VP ]S


Penn POS Tags
Official trading in the shares will start in Paris on Nov 6.

[ Official/JJ trading/NN ]
in/IN
[ the/DT shares/NNS ]
will/MD start/VB in/IN
[ Paris/NNP ]
on/IN
[ Nov./NNP 6/CD ]

Penn Treebank
Official trading in the shares will start in Paris on Nov 6.
( (S (NP-SBJ (NP Official trading)
(PP in
(NP the shares)))
(VP will
(VP start
(PP-LOC in
(NP Paris))
(PP-TMP on
(NP (NP Nov 6)

Penn POS Tag Set
Adjective: JJ                      Plural Noun: NNS
Adverb: RB                         Personal Pronoun: PRP
Cardinal Number: CD                Proper Noun: NNP
Determiner: DT                     Verb (base form): VB
Preposition: IN                    Modal Verb: MD
Coordinating Conjunction: CC       Verb (3sg Pres): VBZ
Subordinating Conjunction: IN      Wh-determiner: WDT
Singular Noun: NN                  Wh-pronoun: WP
