Language models
Human Communication 1
Lecture 6
• Two kinds of language models
• Grammar-based models
• Statistical models
• We still look at grammar-based models
today
1 2
Grammar Formal grammar
• More technically, a formal grammar consists of
• a finite set of terminal symbols
• Symbols • a finite set of nonterminal symbols
• Rules • a set of rules (also called production rules)
• Procedure of rule application • with a left- and a right-handed side
• each consisting of a word
• a start symbol
3 4
Special symbols Terminology
• Alphabet: A set of (terminal and
nonterminal) symbols
• Formal grammars usually have two special • Word: A string of symbols from an alphabet
symbols (what we also called ‘sentence’)
• S: the start symbol • Grammar: A set of rules defined on a
• ε: the empty string (sometimes: λ) alphabet
• Language: The set of words defined by a
grammar
5 6
Representing formal
Formal definition
grammar
A grammar G =〈Φ, ∑, R, S〉consists of
1. An alphabet Φ of nonterminal symbols, • Nonterminals are usually represented by
upper-case letters {S, A, B}
2. An alphabet ∑ of terminal symbols,
• Terminals by lower case letters {a, b, c}
3. A set R ⊆ Γ*╳Γ* of rules (where Γ = Φ ∪ ∑),
• The start symbols by S
4. A start symbol S ∈ Φ.
7 8
Example Solution
• Grammar • Apply rewrite rules until the result contains only symbols
from the alphabet.
• Alphabet: a, b
• We can rewrite
• Start symbol: S
• S to aSb by replacing S with aSb (rule 1);
• Rules: • aSb to aaSbb (rule 1)
1. S → aSb • aSb to abab (rule 2)
2. S → ba • For example: S → aSb → aaSbb → aababb
• The language of this grammar consists of the words
• What words are covered by this grammar? anbabn (where n are 0 or more occurrences of the
symbol, but the number of a’s and b’s is the same)
9 10
Chomsky hierarchy Type-3: regular
• 4 types of grammars (Type-0 to Type-3) • LHS: 1 nonterminal
• Type-0: recursively enumerable • RHS: 1 terminal and 0 or 1 nonterminals
• Type-1: context sensitive • Pattern:
• Type-2: context free (CFG) N→t
N1 → t N2 OR N1→ N2 t
• Type-3: regular
11 12
Type-2: context free Type-1: context
(CFG) sensitive
• LHS: 1 nonterminal • LHS: at least 1 nonterminal
• RHS: terminals and nonterminals • RHS: terminals and nonterminals
• Pattern: • Pattern:
αNβ → αγβ
N→γ
(where N is a nonterminal; α, β, γ are
(where N is a nonterminal; γ is a string of
strings of terminals and nonterminals; γ is
terminals and nonterminals)
not empty)
13 14
Type-1: context Type-0: recursively
sensitive enumerable
• All grammars and languages
• (Those that can be recognised by a Turing
• αNβ → αγβ Machine.)
• α and β are the context in which N can be • Pattern:
replaced by γ α→β
(where α and β are any string of terminals
and nonterminals, including the empty
string)
15 16
Chomsky hierarchy What type?
Type-0 · recursively enumerable S → aS
Type-1 · context sensitive S → abS
S→c
Type-2 · context free
Type-3 · regular
17 18
What type? What type?
S → aS S → aS
S → abS S → abS
S → Sb S → Bb
S→c B→ε
S → cngw
19 20
What type? What type?
S → aS S → aS
S → aABb S → aABb
A → Aa aA → Aa
A→a Aa → a
B → Bb B → Bb
B→b B→b
21 22
What type? Resource
S → aS
S → aABb
• The Wikipedia page on Chomsky
aAb → Aa Hierarchies is a good starting point for
formal grammars:
Aa → a http://en.wikipedia.org/wiki/
B → Bb Chomsky_grammar
Bb → b
23 24
Why the distinction? Human grammar
• Why is this distinction relevant?
• What type of grammar is human grammar?
• Answer: different computational complexity
• Probably in between context free and context
• This means: different amount of resources sensitive: mildly context-sensitive grammars
(MCSG)
needed
• This means in particular: different execution • MCSGs
times
• allow certain kinds of context dependencies
• Generally: The simpler the grammar type, the
• have low computational complexity (they
faster the parsing and generation
have polynomial complexity)
25 26
Parsing algorithms What is the problem?
• In addition to complexity of the grammar, • The simple parsing models have difficulties
there are also different parsing algorithms
• Parallel instead of serial processing: • Serial model requires too much time
If multiple rules apply, investigate all • Parallel models requires too much
possibilities at once. storage
• Problem: • Solution: Often a combination of serial and
A large number of possible analyses must be parallel parsing as well as top-down and
stored (probably exponentially many: length- bottom-up is used
of-sentencesome-constant).
27 28