CHAPTER NO 3
Notes on Describing Syntax and Semantics
1. Introduction
● Describing a programming language clearly is crucial for its success.
● Poor descriptions (like ALGOL 60 and ALGOL 68) made languages hard to
understand, reducing their adoption.
● Informal descriptions lead to different dialects (variations of the same language).
Who needs language descriptions?
1. Initial evaluators – People reviewing the language before finalizing it.
2. Implementors – Developers who build compilers/interpreters need precise rules.
3. Users – Programmers who write code need clear manuals.
2. Syntax vs. Semantics
● Syntax → Structure of code (e.g., while (condition) statement).
● Semantics → Meaning of code (e.g., while keeps executing as long as
condition is true).
● Syntax is easier to describe formally than semantics.
3. Describing Syntax
3.1 General Problem
● A language is a set of valid strings (e.g., valid Java programs).
● Syntax rules define which strings are correct.
● Programs are made of lexemes (smallest meaningful units like +, if, 123).
● Lexemes are grouped into tokens (categories like identifier, operator).
Example:
Statement: index = 2 * count + 17;
Lexeme Token
index identifier
= equal_sign
2 int_literal
* mult_op
count identifier
+ plus_op
17 int_literal
; semicolon
3.2 Language Recognizers vs. Generators
● Recognizer (e.g., compiler) checks if a program follows syntax rules.
● Generator produces valid programs (used for documentation).
4. Formal Methods of Describing Syntax
4.1 Backus-Naur Form (BNF) & Context-Free Grammars
● Developed by Noam Chomsky (linguistics) and John Backus (ALGOL).
● BNF is a metalanguage (a language to describe other languages).
4.2 BNF Basics
● Uses rules (productions) like:
<assign> → <var> = <expression>
○ LHS (Left-Hand Side): Nonterminal (abstraction, e.g., <assign>).
○ RHS (Right-Hand Side): Mix of terminals (=) and nonterminals (<var>).
Example:
● Rule for Java if:
● bnf
● Copy
● Download
<if_stmt> → if (<logic_expr>) <stmt>
● | if (<logic_expr>) <stmt> else <stmt>
4.3 Describing Lists (Using Recursion)
● Example: List of identifiers
● bnf
● Copy
● Download
<ident_list> → identifier
● | identifier, <ident_list>
4.4 Derivations & Parse Trees
● Derivation: Sequence of rule applications starting from <program>.
● Leftmost Derivation: Replace leftmost nonterminal first.
Example Grammar:
bnf
Copy
Download
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var> | <var> - <var> | <var>
Derivation Example:
1. <program> => begin <stmt_list> end
2. => begin <stmt> ; <stmt_list> end
3. => begin <var> = <expression> ; <stmt_list> end
4. => begin A = B + C ; B = C end
Parse Tree:
● Shows hierarchical structure (see Fig. 3.1 in original text).
4.5 Ambiguity
● A grammar is ambiguous if a sentence has multiple parse trees.
● Example (ambiguous grammar for expressions):
● bnf
● Copy
● Download
● <expr> → <expr> + <expr> | <expr> * <expr> | (<expr>) | <id>
○ The expression A + B * C can be parsed in two ways (changing
meaning).
Key Takeaways
● BNF is used to formally define syntax.
● Lexemes (e.g., +, while) form tokens (e.g., operator, keyword).
● Derivations show how sentences are generated.
● Parse trees represent structure.
● Ambiguity occurs when a sentence has multiple interpretations.
Examples help clarify concepts—like how A = B * (A + C) is derived and parsed.
. Ambiguous Grammars
● A grammar is ambiguous if a sentence can have multiple parse trees.
● Example:
● bnf
● Copy
● Download
● <expr> → <expr> + <expr> | <expr> * <expr> | (<expr>) | <id>
○ The expression A = B + C * A has two interpretations:
1. (B + C) * A
2. B + (C * A)
Why is ambiguity bad?
● Compilers use parse trees to generate code.
● Different trees → different meanings → incorrect code.
How to detect ambiguity?
1. A sentence has multiple leftmost derivations.
2. A sentence has multiple rightmost derivations.
Fixing ambiguity:
● Rewrite the grammar to enforce precedence/associativity.
2. Operator Precedence & Associativity
● Precedence defines evaluation order (e.g., * before +).
● Associativity defines order for same operators (e.g., left: A + B + C = (A +
B) + C).
Example (Unambiguous Grammar):
bnf
Copy
Download
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → (<expr>) | <id>
● + and * are forced into correct levels in the parse tree.
Left vs. Right Recursion:
● Left recursion (<expr> → <expr> + <term>) → left associativity.
● Right recursion (<factor> → <exp> ** <factor>) → right associativity
(used for exponentiation).
3. The "Dangling Else" Problem
● Ambiguity in if-then-else:
● bnf
● Copy
● Download
<if_stmt> → if <logic_expr> then <stmt>
● | if <logic_expr> then <stmt> else <stmt>
○ The statement:
○ java
○ Copy
○ Download
○ if (x > 0) if (y > 0) a = 1; else a = 2;
Could mean:
1. else pairs with the first if.
2. else pairs with the second if.
Solution:
● Distinguish matched (has else) and unmatched (no else) statements.
● Rewritten grammar:
● bnf
● Copy
● Download
<stmt> → <matched> | <unmatched>
<matched> → if <logic_expr> then <matched> else <matched>
| (non-if statements)
<unmatched> → if <logic_expr> then <stmt>
● | if <logic_expr> then <matched> else <unmatched>
4. Extended BNF (EBNF)
● Makes grammars more readable:
○ [ ] = Optional (e.g., [else <stmt>]).
○ { } = Repetition (e.g., {<stmt>} = zero or more).
○ ( | ) = Choice (e.g., (* | / | %)).
Example:
● BNF:
● bnf
● Copy
● Download
● <expr> → <expr> + <term> | <expr> - <term> | <term>
● EBNF:
● bnf
● Copy
● Download
● <expr> → <term> {(+ | -) <term>}
5. Attribute Grammars
● Extend BNF to describe static semantics (e.g., type checking).
Key Concepts:
1. Attributes (variables attached to grammar symbols):
○ Synthesized attributes (passed up the parse tree).
○ Inherited attributes (passed down/across the tree).
2. Semantic Functions (compute attribute values).
3. Predicate Functions (enforce rules, e.g., "variables must be declared").
Example:
● Rule: <assign> → <var> = <expr>
○ Predicate: Type of <var> must match type of <expr>.
Why Use Attribute Grammars?
● BNF can’t enforce rules like "declare before use".
● Helps compilers check correctness before execution.
Key Takeaways
● Ambiguity = Multiple parse trees → fixed by rewriting grammar.
● Precedence/Associativity enforced via grammar structure.
● EBNF simplifies grammar notation.
● Attribute Grammars handle static semantics (type rules, declarations).
Examples like A = B + C * A and if-then-else ambiguity illustrate real parsing
challenges.
Intrinsic Attributes
● Definition: Synthesized attributes of leaf nodes (tokens like variables, numbers)
whose values come from outside the parse tree (e.g., symbol tables).
● Example:
○ For a variable x, its actual_type (e.g., int or real) is fetched from the
symbol table.
○ Used to compute attributes for higher-level nodes (e.g., expressions).
2. Attribute Grammars in Practice
● Example: Checking if an Ada procedure’s name matches its end label:
● bnf
● Copy
● Download
● <proc_def> → procedure <proc_name>[1] <proc_body> end <proc_name>[2];
Predicate: <proc_name>[1].string == <proc_name>[2].string
○ Ensures consistency in procedure naming.
● Type Checking Example:
○ Grammar for assignments like A = B + C:
○ bnf
○ Copy
○ Download
<assign> → <var> = <expr>
<expr> → <var> + <var> | <var>
○ <var> → A | B | C
○ Attributes:
1. actual_type: Synthesized (computed from children).
2. expected_type: Inherited (propagated from parent).
○ Rules:
1. <expr>.expected_type ← <var>.actual_type (from LHS of =).
2. For B + C, actual_type is real if types differ, else their common
type.
3. Predicate: actual_type must match expected_type.
3. Denotational Semantics
● Goal: Map language constructs to mathematical objects (e.g., numbers,
functions).
● Example: Binary numbers → Decimal values:
● bnf
● Copy
● Download
● <bin_num> → '0' | '1' | <bin_num> '0' | <bin_num> '1'
Semantic Function:
○ Mbin('0') = 0, Mbin('1') = 1
○ Mbin(<bin_num> '0') = 2 * Mbin(<bin_num>)
○ Mbin(<bin_num> '1') = 2 * Mbin(<bin_num>) + 1
● State in Programs:
○ Program state s = {<var1, val1>, <var2, val2>, ...}.
○ VARMAP(var, s) fetches var’s value from state s.
● Expression Semantics:
○ For E = x + y, Me(E, s) checks:
■ If x or y is undefined → error.
■ Else computes VARMAP(x, s) + VARMAP(y, s).
● Assignment Semantics:
○ Ma(x = E, s): Updates state s with x mapped to Me(E, s).
● Loop Semantics:
○ Ml(while B do L, s):
■ If B is false → return s.
■ Else recurse with Ml(while B do L, Msl(L, s)).
4. Key Takeaways
● Intrinsic Attributes: External data (e.g., symbol tables) define leaf-node attributes.
● Attribute Grammars:
○ Enforce static semantics (e.g., type matching).
○ Use synthesized (bottom-up) and inherited (top-down) attributes.
● Denotational Semantics:
○ Maps syntax to math objects (e.g., numbers, functions).
○ Describes state changes mathematically (no machine model).
○ Examples: Binary numbers → decimals, assignment updates.
Example: For A = A + B (assuming A: real, B: int):
1. actual_type of A + B = real (mixed types).
2. Predicate: LHS (A) matches real → Valid.
This approach ensures precise, mathematical reasoning about program behavior.
Notes on Axiomatic Semantics for Program Verification
1. Assignment Statements
● Axiom: {Q[x → E]} x = E {Q}
○ Replace all instances of x in postcondition Q with E to get the precondition.
● Example:
● text
● Copy
● Download
a = b / 2 - 1 {a < 10}
● Precondition: b / 2 - 1 < 10 → b < 22
● Side Effects: The axiom assumes no side effects (e.g., no changes to other
variables).
2. Rule of Consequence
● Strengthen preconditions or weaken postconditions:
● text
● Copy
● Download
{P} S {Q}, P' ⇒ P, Q ⇒ Q'
--------------------------
● {P'} S {Q'}
● Example:
○ Given {x > 3} x = x - 3 {x > 0} and x > 5 ⇒ x > 3,
○ Conclude {x > 5} x = x - 3 {x > 0}.
3. Sequences
● Inference Rule:
● text
● Copy
● Download
{P1} S1 {P2}, {P2} S2 {P3}
----------------------------
● {P1} S1; S2 {P3}
● Example:
● text
● Copy
● Download
y = 3 * x + 1;
x = y + 3;
● {x < 10}
○ Precondition for x = y + 3: y < 7.
○ Precondition for y = 3 * x + 1: x < 2.
4. Selection (if-then-else)
● Inference Rule:
● text
● Copy
● Download
{B ∧ P} S1 {Q}, {¬B ∧ P} S2 {Q}
---------------------------------
● {P} if B then S1 else S2 {Q}
● Example:
● text
● Copy
● Download
● if x > 0 then y = y - 1 else y = y + 1 {y > 0}
○ For y = y - 1: Precondition y > 1.
○ For y = y + 1: Precondition y > -1.
○ Combined precondition: y > 1 (stronger condition).
5. Loops (while)
● Loop Invariant (I): Must satisfy:
○ P ⇒ I (Precondition implies invariant).
○ {I ∧ B} S {I} (Invariant holds after each iteration).
○ I ∧ ¬B ⇒ Q (Loop exit ensures postcondition).
○ Termination must be proven separately.
● Example:
● text
● Copy
● Download
● {y ≤ x} while y ≠ x do y = y + 1 end {y = x}
○ Invariant: y ≤ x.
○ Termination: y increments until y = x.
6. Program Proofs
● Example 1: Swapping Variables
● text
● Copy
● Download
{x = A ∧ y = B}
t = x;
x = y;
y = t;
● {x = B ∧ y = A}
○ Apply assignment axiom backward to derive the precondition.
● Example 2: Factorial Calculation
● text
● Copy
● Download
{n ≥ 0}
count = n;
fact = 1;
while count ≠ 0 do
fact = fact * count;
count = count - 1;
end
● {fact = n!}
○ Invariant: fact = (count + 1) * ... * n ∧ count ≥ 0.
○ Proves correctness by induction.
7. Evaluation
● Strengths:
○ Rigorous framework for proving program correctness.
○ Useful for reasoning about loops and complex logic.
● Limitations:
○ Difficult to define for full languages (e.g., pointers, side effects).
○ Loop invariants require ingenuity.
○ Primarily used for verification, not compiler design.
Key Takeaways
● Axiomatic semantics uses preconditions/postconditions to define meaning.
● Assignment axiom: Substitute to compute weakest preconditions.
● Loops require a loop invariant and termination proof.
● Practical for proofs but challenging for full language specifications.
Example: Proving a factorial loop correct ensures it computes n! for all n ≥ 0.
EXERCISE
. Define syntax and semantics.
● Syntax: The structure/form of valid expressions, statements, and programs in a
language (e.g., while (condition) statement).
● Semantics: The meaning/behavior of those constructs (e.g., while executes
statement repeatedly while condition is true).
2. Who are language descriptions for?
● Initial evaluators (design reviewers).
● Implementors (compiler writers).
● Users (programmers).
3. Describe a general language generator.
A device (e.g., grammar) that produces valid sentences (programs) of a language by
applying rules recursively (e.g., BNF).
4. Describe a general language recognizer.
A device (e.g., parser) that checks if a given string (program) belongs to the language by
validating its structure against rules.
5. Difference between sentence and sentential form.
● Sentence: A string of terminals (valid program).
● Sentential form: Intermediate string in a derivation (mix of
terminals/nonterminals).
6. Define left-recursive grammar rule.
A rule where the LHS appears at the start of the RHS (e.g., <expr> → <expr> +
<term>).
7. Three EBNF extensions.
1. [ ] for optional elements.
2. { } for repetition (0+ times).
3. ( | ) for choice among alternatives.
8. Static vs. dynamic semantics.
● Static: Rules checked at compile-time (e.g., type compatibility).
● Dynamic: Runtime behavior (e.g., loop execution).
9. Purpose of predicates in attribute grammars.
Enforce static semantic rules (e.g., type checks, declaration before use).
10. Synthesized vs. inherited attributes.
● Synthesized: Computed from child nodes (bottom-up).
● Inherited: Passed down from parent/siblings (top-down).
11. Order of attribute evaluation.
Determined by dependency graphs (evaluate attributes only when their dependencies
are resolved).
12. Primary use of attribute grammars.
Describe static semantics (e.g., type checking) and aid compiler design.
13. Uses of semantic description methods.
● Programmers: Understand language behavior.
● Designers: Detect ambiguities.
● Compiler writers: Implement correct translations.
14. Why not machine languages for operational semantics?
Too low-level (complex state changes); use abstract interpreters instead.
15. Two levels of operational semantics.
1. Natural: Focus on final result.
2. Structural: Detailed state changes per step.
16. Syntactic vs. semantic domains in denotational semantics.
● Syntactic: Program constructs (e.g., expressions).
● Semantic: Mathematical objects (e.g., integers, functions).
17. State in denotational semantics.
Set of variable-value pairs (e.g., {<x, 5>, <y, 10>}).
18. Most widely known semantics approach.
Denotational semantics (rigorous, math-based).
19. Requirements for denotational description.
1. Mathematical object for each entity.
2. Mapping function from syntax to semantics.
20. Antecedent in inference rules.
Top part (e.g., {P} S {Q} is the antecedent in {P} S {Q}, P ⇒ P’).
21. Predicate transformer function.
Function wp(S, Q) computes the weakest precondition for statement S to achieve
postcondition Q.
22. Partial correctness for loops.
Loop postcondition holds if the loop terminates (no guarantee of termination).
23. Basis of axiomatic semantics.
Predicate calculus (logic).
24. Basis of denotational semantics.
Recursive function theory (mathematics).
25. Problem with pure interpreters for operational semantics.
Too implementation-dependent; abstract machines are preferred.
26. Preconditions/postconditions in axiomatic semantics.
● Precondition (P): Constraints before execution.
● Postcondition (Q): Constraints after execution.
27. Proving correctness with axiomatic semantics.
1. Start with postcondition.
2. Compute weakest preconditions backward through statements.
3. Show initial precondition implies derived precondition.
28. Denotational semantics concept.
Map language constructs to mathematical objects (e.g., numbers, functions) to define
meaning.
29. Operational vs. denotational semantics.
● Operational: Describes execution steps (like an interpreter).
● Denotational: Maps constructs to math objects (abstract meaning).