UNIT - III Syntax Analyzer Santanu Halder 1 Compiler Design Role of Parser Santanu Halder 2 Compiler Design Role
Views 71 Downloads 0 File size 1MB
UNIT - III Syntax Analyzer Santanu Halder
1 Compiler Design
Role of Parser
Santanu Halder
2 Compiler Design
Role of Syntax Analyzer if (b == 0) a = b; Lexical Analysis or Scanner if
(
b
==
0
)
a
=
b
;
Syntax Analysis or Parsing if == b Santanu Halder
abstract syntax tree or parse tree
= 0
a
b 3 Compiler Design
Grammar Definition: Generally a grammar is represented as 4-tuples – G = Where V = finite set of variables (or non-terminals) T = finite set of terminal symbols P = finite set of productions S = Starting symbol Example: G = ({S}, {a, b}, {S→aSb, S→ab}, S) What language does this grammar accept?
Santanu Halder
4 Compiler Design
Grammar.. Definition: Generally a grammar is represented as 4-tuples – G = Where V = finite set of variables (or non-terminals) T = finite set of terminal symbols P = finite set of productions S = Starting symbol Example: G = ({S}, {a, b}, {S→aSb, S→ab}, S) What language does this grammar accept? L(G) = {anbn | n≥1}
Santanu Halder
5 Compiler Design
Chomsky Classification Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)*
Santanu Halder
6 Compiler Design
Chomsky Classification.. Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)* Example: S→S1B S1→aS1B bB→bbbB aS1b→aa B→ϵ
Santanu Halder
7 Compiler Design
Chomsky Classification.. Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)* Example: S→S1B S1→aS1B bB→bbbB aS1b→aa B→ϵ L(G) = {an+1bn+k | n≥1, k=-1,1,3….} Santanu Halder
8 Compiler Design
Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α|
Santanu Halder
9 Compiler Design
Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α| Example: S→abc | aAbc Ab→bA Ac→Bbcc bB→Bb aB→aa | aaA
Santanu Halder
10 Compiler Design
Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α| Example: S→abc | aAbc Ab→bA Ac→Bbcc bB→Bb aB→aa | aaA L(G) = {anbncn | n≥1} Santanu Halder
11 Compiler Design
Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)*
Santanu Halder
12 Compiler Design
Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)* Example: S→aSa S→bSb S→ϵ
Santanu Halder
13 Compiler Design
Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)* Example: S→aSa S→bSb S→ϵ L(G) = {WWR | W is in {a, b}*}
Santanu Halder
14 Compiler Design
Chomsky Classification.. Type 3 or Regular Grammar: A grammar G = (V, T, P, S) is said to be right linear grammar if all the production rules are of the form – A→xB or A→x where A, B is in V and x is in T* A grammar G = (V, T, P, S) is said to be left linear grammar if all the production rules are of the form – A→Bx or A→x where A, B is in V and x is in T* A regular grammar is one that is either left linear or right linear.
Santanu Halder
15 Compiler Design
Chomsky Classification.. Example: Right linear S→abS | a
Santanu Halder
L(G) = {(ab)*a}
16 Compiler Design
Chomsky Classification.. Example: Right linear S→abS | a Left Linear S→S1ab S1→S1ab | S2 S2→a
Santanu Halder
L(G) = {(ab)*a} L(G) = {aab(ab)*}
17 Compiler Design
Chomsky Classification.. Example: Right linear S→abS | a Left Linear S→S1ab S1→S1ab | S2 S2→a Linear grammar S→A A→aB | ϵ B→Ab
Santanu Halder
L(G) = {(ab)*a} L(G) = {aab(ab)*}
L(G) = {anbn | n ≥ 0}
18 Compiler Design
Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1}
Santanu Halder
19 Compiler Design
Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab}
Santanu Halder
20 Compiler Design
Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab} 2. Construct a CFG to generate the set of palindromes over alphabet {a, b}.
Santanu Halder
21 Compiler Design
Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab} 2. Construct a CFG to generate the set of palindromes over alphabet {a, b}. G = (V, T, P, S) V = {S}, T = {a, b}, And for even palindrome – P = {S→aSa, S→bSb, S→ ε} And for odd palindrome – P = {S→aSa, S→bSb, S→a, S→b}
Santanu Halder
22 Compiler Design
Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}.
Santanu Halder
23 Compiler Design
Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ ε} 4. Design a CFG for L(G) = {aibicjdj | i,j≥0}.
Santanu Halder
24 Compiler Design
Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ ε} 4. Design a CFG for L(G) = {aibicjdj | i,j≥0}. Let, L1 = {aibi | i ≥ 0} and L2 = {cjdj | j ≥ 0} Hence L(G) = L1·L2 L1 can be expressed as {A→aAb | ε} L2 can be expressed as {B→cBd | ε} Finally, P = {S→AB, A→aAb | ε, B→cBd | ε}
Santanu Halder
25 Compiler Design
Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}.
Santanu Halder
26 Compiler Design
Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0
Santanu Halder
27 Compiler Design
Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0 This can be rewritten as – L1 = {0i1i | i ≥ 0} L2 = {1m | m >0} L3 = {1k0k| k ≥ 0} And hence – L(G) = L1·L2·L3
Santanu Halder
28 Compiler Design
Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0 This can be rewritten as – L1 = {0i1i | i ≥ 0} L2 = {1m | m >0} L3 = {1k0k| k ≥ 0} And hence – L(G) = L1·L2·L3 L1 can be expressed as {A→0A1 | ε} L2 can be expressed as {B→1B | 1} L3 can be expressed as {C→1C0 | ε} Finally, P = {S→ABC, A→0A1| ε, B→1B|1, C→1C0 | ε} Santanu Halder
29 Compiler Design
Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}.
Santanu Halder
30 Compiler Design
Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc}
Santanu Halder
31 Compiler Design
Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc} 7. Design a CFG for L(G) = {WWR | W is a binary string}.
Santanu Halder
32 Compiler Design
Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc} 7. Design a CFG for L(G) = {WWR | W is a binary string}. G = (V, T, P, S) V = {S}, T = {0, 1}, P = {S→0S0 | 1S1 | ε}
Santanu Halder
33 Compiler Design
Context Free Grammar 8. Design a CFG for L(G) = (a+b)*.
Santanu Halder
34 Compiler Design
Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε}
Santanu Halder
35 Compiler Design
Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}.
Santanu Halder
36 Compiler Design
Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}. Case 1: n < m L = {anbnbi | i > 0} L1 = {anbn} L2 = {bi | i > 0}
Santanu Halder
37 Compiler Design
Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}. Case 1: n < m L = {anbnbi | i > 0} L1 = {anbn} L2 = {bi | i > 0} Hence, L1 can be expressed as : {A→aAb | ε} L2 can be expressed as : {B→bB | b} So, P1 = {S1→AB, A→aAb | ε, B→bB | b} Santanu Halder
38 Compiler Design
Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0}
Santanu Halder
39 Compiler Design
Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0} Hence, L1 can be expressed as : {C→aC | a} L2 can be expressed as : {D→aDb | ε} So, P2 = {S2→CD, C→aC | a, D→aDb | ε }
Santanu Halder
40 Compiler Design
Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0} Hence, L1 can be expressed as : {C→aC | a} L2 can be expressed as : {D→aDb | ε} So, P2 = {S2→CD, C→aC | a, D→aDb | ε } Finally – P = { S→S1 | S2, S1→AB, A→aAb | ε, B→bB | b S2→CD, C→aC | a, D→aDb | ε } Santanu Halder
41 Compiler Design
Leftmost Derivation A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced.
Santanu Halder
42 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is –
Santanu Halder
43 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB
Santanu Halder
44 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb
Santanu Halder
45 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A
Santanu Halder
46 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb
Santanu Halder
47 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb => abbbbB B→ ϵ
Santanu Halder
48 Compiler Design
Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb => abbbbB B→ ϵ => abbbb B→ ϵ
Santanu Halder
49 Compiler Design
Rightmost Derivation A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced.
Santanu Halder
50 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is –
Santanu Halder
51 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB
Santanu Halder
52 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A
Santanu Halder
53 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb
Santanu Halder
54 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ
Santanu Halder
55 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ => abBbbb A→bBb
Santanu Halder
56 Compiler Design
Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ => abBbbb A→bBb => abbbb B→ ϵ
Santanu Halder
57 Compiler Design
Derivation Tree Let G=(V, T, P, S) be a CFG. An ordered tree is a derivation tree for G iff it has the following properties: 1. The root is labeled S. 2. Every leaf has a label from T U {ϵ}. 3. Every interior vertex has a label from V. 4. If a vertex has label A (which is in V), and its children are labeled from left to right as a1, a2, …, an, then P must contain a production of the form – A→a1a2…an 5. A leaf labeled ϵ has no siblings.
Santanu Halder
58 Compiler Design
Derivation Tree.. S
a
A
b
B
B
b
ϵ
ϵ Santanu Halder
59 Compiler Design
Simplification of CFG Steps to simplify a grammar – Step 1: Eliminate null production. Step 2: Eliminate unit production. Step 3: Eliminate production that contains useless variables. Useless variables are of two types – 1. Variables which are not deriving terminal string. 2. Variables that can’t be reached from start symbol.
Santanu Halder
60 Compiler Design
Simplification of CFG.. Elimination of null production: Let G={V, T, P, S} and G′={V, T, P′, S} Step 1: Say w1={A ϵ V | A→ ɛ is in P} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ wi*} Stop when wi = wi+1 and wi is the set of all null able variables.
Santanu Halder
61 Compiler Design
Simplification of CFG.. Elimination of null production: Let G={V, T, P, S} and G′={V, T, P′, S} Step 1: Say w1={A ϵ V | A→ ɛ is in P} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ wi*} Stop when wi = wi+1 and wi is the set of all null able variables. Step 2: a) Any productions, whose RHS does not have any nullable variable, is included in P′. b) If A→X1X2…Xk is in P, the productions of the form A→α1α2…αk are included in P′, where αi = Xi if Xi is not in wi. αi = Xi or ɛ if xi ϵ wi and α1α2…αk ≠ ɛ.
Santanu Halder
62 Compiler Design
Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d}
Santanu Halder
63 Compiler Design
Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ
Santanu Halder
64 Compiler Design
Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ w2 = w1 U {A} Since A→BC = {A, B, C}
Santanu Halder
65 Compiler Design
Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ w2 = w1 U {A} Since A→BC = {A, B, C} w3 = w2 U {Φ} Since no more such productions = {A, B, C}
Santanu Halder
66 Compiler Design
Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: Hence – P′ = { S→ABaC|BaC|AaC|ABa|aC|Aa|Ba|a A→BC|C|B B→b C→D D→d}
Santanu Halder
67 Compiler Design
Simplification of CFG.. Elimination of Unit Production: Any production of a CFG of the form A→B where A, B ϵ V is called an unit production.
Santanu Halder
68 Compiler Design
Simplification of CFG.. Elimination of Unit Production: Any production of a CFG of the form A→B where A, B ϵ V is called an unit production. For the following grammar, S→Aa |B B→A|bb A→a|bc|B The unit productions are: S→B B→A A→B The dependency graph for the unit productions is – S
Santanu Halder
A
B
69 Compiler Design
Simplification of CFG.. Elimination of Unit Production: S
A
B
Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb
Santanu Halder
70 Compiler Design
Simplification of CFG.. Elimination of Unit Production: S
A
B
Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb The new rules: S→a | bc | bb A→bb B→a | bc Santanu Halder
71 Compiler Design
Simplification of CFG.. Elimination of Unit Production: S
A
B
Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb The new rules: Altogether, S→a | bc | bb S→Aa | a | bc | bb A→bb A→a | bc | bb B→a | bc B→bb | a | bc Santanu Halder
72 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Let G={V, T, P, S} and G′={V′, T, P′, S} Step 1: Say w1={A ϵ V | A→ x is in P where xϵT*} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ {T U wi}*} Stop when wi = wi+1 and then V′ = {wi}. Step 2: P′={A→x | A, x ϵ (V′UT)*}
Santanu Halder
73 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c }
Santanu Halder
74 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c
Santanu Halder
75 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E}
Santanu Halder
76 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E} w3 = w2 U {Φ} Since no more such productions. = {S, A, B, E}
Santanu Halder
77 Compiler Design
Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E} w3 = w2 U {Φ} Since no more such productions. = {S, A, B, E} Hence, V′ = {S, A, B, E} So, P′ = { S→AB, A→a, B→b, E→c }
Santanu Halder
78 Compiler Design
Simplification of CFG.. Elimination of variables not reachable from start symbol: Let G={V, T, P, S} and G′={V′, T, P′, S} Step 1: Draw dependency graph for the variables. A variable is useful only if there is a path from the vertex labelled S. V′ = {useful variables} Step 2: P′={A→x | A, x ϵ (V′UT)*}
Santanu Halder
79 Compiler Design
Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa }
Santanu Halder
80 Compiler Design
Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa } The dependency graph for V is:
S
Santanu Halder
A
B
81 Compiler Design
Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa } The dependency graph for V is:
S
A
B
Hence V′ = {S, A} So, P′ = {{ S→aS | A, A→a}
Santanu Halder
82 Compiler Design
Normal Forms for CFG In a context free grammar, then right hand side of a production rule can be any string of terminals and variables. When the production rules in a grammar satisfy certain restrictions, then the grammar is said to be in a normal form. The normal forms for CFG are of two types: 1. Chomsky Normal Form (CNF) 2. Greibach Normal Form (GNF)
Santanu Halder
83 Compiler Design
Chomsky Normal Form (CNF) Definition: A context free grammar is in CNF, if all the production rules are of the form: A → BC or, A → a Where A, B, C ϵ V and a ϵ T. Example: S → AS | a A → SA | b
Santanu Halder
84 Compiler Design
Greibach Normal Form (GNF) Definition: A context free grammar is said to be in GNF, if all the production rules are of the form: A → ax Where a ϵ T and x ϵ V*. Example: S → aAB | bBB | bB A → aA | bB | b B→b
Santanu Halder
85 Compiler Design
S - Grammar A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P.
Santanu Halder
86 Compiler Design
S - Grammar.. A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P. Example: S→aS S→aS S→bSS S→bSS S→c S→aSS S→c
Santanu Halder
87 Compiler Design
S - Grammar.. A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P. Example: S→aS S→aS S→bSS S→bSS S→c S→aSS S→c
Santanu Halder
88 Compiler Design
Ambiguous Grammar A context free grammar G = (V, T, P, S) is said to be ambiguous if there exists some w (which is in L(G)) that has at least two distinct derivation trees.
Santanu Halder
89 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id
Santanu Halder
90 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id
E E
+
E
id
E
+
id
Santanu Halder
E id 91 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id
E
E
E
+
E
id
E
+
id
Santanu Halder
E
E
id
id
E
+
E
+
E
id
id 92 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id
Santanu Halder
93 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id
E E
+
E
id
E
*
id
Santanu Halder
E id 94 Compiler Design
Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id
E
E
E
+
E
id
E
*
id
Santanu Halder
E
E
id
id
E
*
E
+
E
id
id 95 Compiler Design
Ambiguous Grammar.. Solution: Associate precedence rules with the operators. E→E+E | E*E | id Rewrite the production rules as follows to make it unambiguous – E→E+T | T T→T*F | F F→id
Santanu Halder
96 Compiler Design
Ambiguous Grammar.. E
E
E
+
T
E
+
T
+
T
F
T
T
*
T
F
id
F
F
F
id
id
id
E
F id
id
Santanu Halder
97 Compiler Design
Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. bExp → bExp or bExp → bExp and bExp → not bExp → True | False
Santanu Halder
98 Compiler Design
Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. bExp → bExp or bExp → bExp and bExp → not bExp → True | False Solution: E → E or T | T T → F and G | G G → not H | True | False
Santanu Halder
99 Compiler Design
Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. stmt → if expr then stmt stmt → if expr then stmt else stmt stmt → other
Santanu Halder
100 Compiler Design
Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. stmt → if expr then stmt stmt → if expr then stmt else stmt stmt → other Solution: Idea: A statement appearing between a then and an else must be matched stmt → match_stmt | open_stmt match_stmt → if expr then match_stmt else match_stmt match_stmt → other open_stmt → if expr then stmt open_stmt → if expr then match_stmt else open_stmt Santanu Halder
101 Compiler Design
Right Recursive Grammar A grammar G is said to be right recursive if it contains the following type of production rules: A→αA | β
Santanu Halder
102 Compiler Design
Left Recursive Grammar A grammar G is said to be left recursive if it contains the following type of production rules: A→Aα | β In other words, a grammar G is left recursive if it has a nonterminal A such that there is a derivation A+=> Aα for some string α. Top down parsing methods can not accept left recursive grammar.
Santanu Halder
103 Compiler Design
Elimination of Left Recursion Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ
Santanu Halder
104 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: E→E+T | T
Santanu Halder
105 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: E→E+T | T After elimination of left recursion – E → TE’ E’→ +TE’ | ϵ
Santanu Halder
106 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S→S0S1S | 01
Santanu Halder
107 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S→S0S1S | 01 After elimination of left recursion – S → 01S’ S’→ 0S1SS’ | ϵ
Santanu Halder
108 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x L → L,S | S
Santanu Halder
109 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x not left recursive L → L,S | S left recursive
Santanu Halder
110 Compiler Design
Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x not left recursive L → L,S | S left recursive So, after elimination of left recursion from production rule 2 – L → SL’ L’→ ,SL’ | ϵ
Santanu Halder
111 Compiler Design
Left Factoring Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive or top-down parsing. When the choice between two alternative A-productions is not clear, one may be able to rewrite the productions to defer the decision until enough of the input, has been seen that he/she can make the right choice. For example – A → αβ1 | αβ2 | … | αβn As a solution, rewrite the production rules as – A → αA’ A’→ β1 | β2 … | βn
Santanu Halder
112 Compiler Design
Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e
Santanu Halder
A → aBcDE B → aBdEC B→b C→c D→d E→e
113 Compiler Design
Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e Non-deterministic grammar
Santanu Halder
A → aBcDE B → aBdEC B→b C→c D→d E→e Deterministic Grammar
114 Compiler Design
Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e Non-deterministic grammar Apply left factoring
Santanu Halder
A → aBcDE B → aBdEC B→b C→c D→d E→e Deterministic Grammar Does not require left factoring
115 Compiler Design
Left Factoring.. Example: S → iEtS | iEtSeS | a E→b
Santanu Halder
116 Compiler Design
Left Factoring.. Example: S → iEtS | iEtSeS | a E→b Solution: S → iEtSS’ | a S’→ ϵ | eS E→b
Santanu Halder
117 Compiler Design
Left Factoring.. Example: S → aSSbS | aSaSb | abb | b
Santanu Halder
118 Compiler Design
Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb
Santanu Halder
119 Compiler Design
Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb Step 2: S’→ SS’’ | bb S’’→ SbS | aSb
Santanu Halder
120 Compiler Design
Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb Step 2: S’→ SS’’ | bb S’’→ SbS | aSb Altogether: S → aS’ | b S’→ SS’’ | bb S’’→ SbS | aSb Santanu Halder
121 Compiler Design
Closure Properties of CFL and RL 1. If L1 be a context free language and L2 be a regular language, then L1∩L2 is a context free language. 2. Regular sets are closed under transpose. 3. Regular sets are closed under complementation. 4. Regular sets are closed under intersection. 5. Context free languages are closed under union. 6. Context free languages are closed under concatenation. 7. Context free languages are closed under star closure. 8. Context free languages are not closed under intersection. 9. Context free languages are not closed under complementation.
Santanu Halder
122 Compiler Design
Type of Parser Parsers Top Down Parsers TDP with full Backtracking Brute Force Method
Santanu Halder
Bottom Up Parsers TDP without Backtracking
Recursive Descent
Non Recursive Descent (LL(1))
123 Compiler Design
Type of Parser.. Parsers Top Down Parsers
Bottom Up Parsers
Operator Precedence Parser
LR Parser LR(0)
Santanu Halder
SLR(1)
LALR(1)
CLR(1)
124 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
Santanu Halder
w = abbcde
125 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser S a
A
Santanu Halder
B
e
126 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser S a
A A
b
B
e
c
Santanu Halder
127 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser S a
A A
b
B
e
c
b
Santanu Halder
128 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser S a
A A
b
B c
e
d
b
Santanu Halder
129 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
S a
A A
b
B c
b
Santanu Halder
e
d a
b
b
c
d
e
130 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
S a
A A
b
B c
b
Santanu Halder
e
d
A a
b
b
c
d
e
131 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
S a
A A
b
B c
b
Santanu Halder
e
A
d
A a
b
b
c
d
e
132 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
S a
A A
b
B c
b
Santanu Halder
e
A
d
B
A a
b
b
c
d
e
133 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
S a
S
A A
b
B c
b
Santanu Halder
e
A
d
B
A a
b
b
c
d
e
134 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation)
(Right most derivation)
Santanu Halder
135 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation) S => aABe
(Right most derivation) S => aABe
Santanu Halder
136 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation) S => aABe => aAbcBe
(Right most derivation) S => aABe => aAde
Santanu Halder
137 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation) S => aABe => aAbcBe => abbcBe
(Right most derivation) S => aABe => aAde => aAbcde
Santanu Halder
138 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation)
(Reverse of Right most derivation)
S => aABe => aAbcBe => abbcBe => abbcde
Santanu Halder
S => aABe => aAde => aAbcde => abbcde
139 Compiler Design
Difference between TDP and BUP S → aABe A → Abc | b B→d
w = abbcde
Top down parser
Bottom up parser
(Left most derivation)
(Reverse of Right most derivation)
S => aABe => aAbcBe => abbcBe => abbcde
Santanu Halder
S => aABe => aAde => aAbcde => abbcde
140 Compiler Design
Recursive Descent Parser Consists of a set of procedures, one for each nonterminal. Execution begins with the procedure for start symbol. Recursive descent parsers cant be used for left-recursive grammars.
Santanu Halder
141 Compiler Design
Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε
Santanu Halder
142 Compiler Design
Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }
Santanu Halder
143 Compiler Design
Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }
Santanu Halder
E’() { if(l == '+') { match(+); match(i); E’(); } else return(); } 144 Compiler Design
Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }
Santanu Halder
E’() { if(l == '+') { match('+'); match('i'); E’(); } else return(); }
match(char t) { if(l == t) l = getchar(); else printf(“error”); }
145 Compiler Design
Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }
Santanu Halder
E’() { if(l == '+') { match('+'); match('i'); E’(); } else return(); }
match(char t) { if(l == t) l = getchar(); else printf(“error”); } main() { E(); if(l=='$') printf(“accept”); } 146 Compiler Design
FIRST and FOLLOW To compute FIRST(X) for all symbols X in grammar G, apply following rules until no more terminals or ɛ can be added to any FIRST set: 1. If X is a terminal, then FIRST(X) = {X}. 2. If X is a non-terminal and X→Y1Y2…Yk is a production for some k≥1, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ɛ is in all of FIRST(Y1),…,FIRST(Yi-1). If ɛ is in FIRST(Yj) for j=1…k then add ɛ to FIRST(X). 3.If X→ ɛ is a production, then add ɛ to FIRST(X).
Santanu Halder
147 Compiler Design
FIRST and FOLLOW.. To compute FOLLOW(A) for all non-terminals A, apply the following rules until nothing can be added to any FOLLOW set. 1. Place $ in FOLLOW(S) , where S is the start symbol, and $ is the input right end marker . 2. If there is a production A → αBβ, then everything in FIRST(β) except ɛ is in FOLLOW(B). 3. If there is a production A → αB, or a production A →αBβ, where FIRST(β) contains ɛ, then everything in FOLLOW(A) is in FOLLOW(B) .
Santanu Halder
148 Compiler Design
FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S→ABCDE A→a | ɛ B→b | ɛ C→c D→d | ɛ E→e | ɛ
Santanu Halder
149 Compiler Design
FIRST and FOLLOW.. Solution: S→ABCDE A→a | ɛ B→b | ɛ C→c D→d | ɛ E→e | ɛ
Santanu Halder
S A B C D E
FIRST {a, b, c} {a, ɛ} {b, ɛ} {c} {d, ɛ} {e, ɛ}
FOLLOW {$} {b, c} {c} {d, e, $} {e, $} {$}
150 Compiler Design
FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S→Bb | Cd B→aB | ɛ C→cC | ɛ
Santanu Halder
151 Compiler Design
FIRST and FOLLOW.. Solution: S→Bb | Cd B→aB | ɛ C→cC | ɛ
Santanu Halder
FIRST S {a, b, c, d} B {a, ɛ} C {c, ɛ}
152 Compiler Design
FIRST and FOLLOW.. Solution: S→Bb | Cd B→aB | ɛ C→cC | ɛ
Santanu Halder
FIRST S {a, b, c, d} B {a, ɛ} C {c, ɛ}
FOLLOW {$} {b} {d}
153 Compiler Design
FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)
Santanu Halder
154 Compiler Design
FIRST and FOLLOW.. Solution: E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)
Santanu Halder
E E’ T T’ F
FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}
155 Compiler Design
FIRST and FOLLOW.. Solution: E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)
Santanu Halder
E E’ T T’ F
FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}
FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )}
156 Compiler Design
FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S → ACB | CbB | Ba A → da | BC B→g|ɛ C→h|ɛ
Santanu Halder
157 Compiler Design
FIRST and FOLLOW.. Solution: S → ACB | CbB | Ba A → da | BC │ ε B→g|ɛ C→h|ɛ
Santanu Halder
S A B C
FIRST {d, g, h, ɛ, b, a} {d, g, h, ɛ} {g, ɛ} {h, ɛ}
158 Compiler Design
FIRST and FOLLOW.. Solution: S → ACB | CbB | Ba A → da | BC B→g|ɛ C→h|ɛ
Santanu Halder
S A B C
FIRST {d, g, h, ɛ, b, a} {d, g, h, ɛ} {g, ɛ} {h, ɛ}
FOLLOW {$} {h, g, $} {$, a, h, g} {g, $, b, h}
159 Compiler Design
LL(1) Grammar Input buffer LL(1)
$
LL(1) parser $ Stack
Scan input from Left to right
Left most deriva tion
Size of look ahead
LL(1) parsing table
Santanu Halder
160 Compiler Design
LL(1) Grammar An unambiguous, non left recursive grammar G is LL(l) if and only if whenever A →α | β are two distinct productions of G, the following conditions hold: 1. For no terminal a, do both α and β derive strings beginning with a. 2. At most one of α and β can derive the empty string. 3. If , then α does not derive any string beginning with a terminal in FOLLOW(A) . Likewise, if then β does not derive any string beginning with a terminal in FOLLOW(A).
Santanu Halder
161 Compiler Design
Parser table for LL(1) Method: For each production A→α in the grammar, do the following to construct a parser table – 1.For each terminal a in FIRST(A), add A→α to M[A, a]. 2.If ε is in FIRST(α), then for each terminal b in FOLLOW(A), add A→α to M[A, b]. If ε is in FIRST(α), and $ is in FOLLOW(A), then add A→α to M[A, $] as well.
Santanu Halder
162 Compiler Design
Parser table for LL(1).. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)
Santanu Halder
E E’ T T’ F
FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}
FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )}
163 Compiler Design
Parser table for LL(1).. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E) id E
FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )} ( )
$
E→TE’ E’→+TE’
T→FT’
T’ F
+
E→TE’
E’ T
E E’ T T’ F
FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (} *
E’→ε
E’→ε
T’→ ε
T’→ ε
T→FT’ T’→ε
F→id
Santanu Halder
T’→*FT’ F→(E)
164 Compiler Design
Parser table for LL(1).. S → (S) | ɛ
Santanu Halder
FIRST S {(, ɛ}
FOLLOW {$, )}
165 Compiler Design
Parser table for LL(1).. S → (S) | ɛ
FIRST S {(, ɛ}
( ) S S→(S) S→ε
Santanu Halder
FOLLOW {$, )}
$ S→ε
166 Compiler Design
Table driven LL(1) parsing Algo INPUT: A string w and a parsing table M for grammar G. OUTPUT: If w is in L(G) , a leftmost derivation of w; otherwise, an error indication. METHOD: set ip to point to the first symbol of w; set X to the top stack symbol; While(X≠$) { // a is current symbol of w. if ( X is a ) {pop the stack and advance ip;} elseif ( X is a terminal ) error(); elseif ( M [X, a] is an error entry ) {error() ;} elseif ( M[X, a] = X→Y1Y2…Yk) { output the production X →Y1Y2…Yk; pop the stack; push Yk, Yk -l, ... , Y1 onto the stack, with Y1 on top; } set X to the top stack symbol; }
Santanu Halder
167 Compiler Design
Parsing for LL(1) S → (S) | ɛ
FIRST S {(, ɛ}
( ) S S→(S) S→ε
FOLLOW {$, )}
$ S→ε
Show the sequence of moves to parse the string (()) using the above parse table.
Santanu Halder
168 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched
Stack S$
Santanu Halder
w= (()) Input (())$
$ S→ε Action
169 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched
Stack S$ (S)$
Santanu Halder
w= (()) Input (())$ (())$
$ S→ε Action output S→(S)
170 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched (
Stack S$ (S)$ S)$
Santanu Halder
w= (()) Input (())$ (())$ ())$
$ S→ε Action output S→(S) match '('
171 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( (
Stack S$ (S)$ S)$ (S))$
Santanu Halder
w= (()) Input (())$ (())$ ())$ ())$
$ S→ε Action output S→(S) match '(' output S→(S)
172 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( ((
Stack S$ (S)$ S)$ (S))$ S))$
Santanu Halder
w= (()) Input (())$ (())$ ())$ ())$ ))$
$ S→ε Action output S→(S) match '(' output S→(S) match '('
173 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( ((
Stack S$ (S)$ S)$ (S))$ S))$ ))$
Santanu Halder
w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$
$ S→ε Action output S→(S) match '(' output S→(S) match '(‘ output S→ε
174 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( (( (()
Stack S$ (S)$ S)$ (S))$ S))$ ))$ )$
Santanu Halder
w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$ )$
$ S→ε Action output S→(S) match '(' output S→(S) match '(‘ output S→ε match ')'
175 Compiler Design
Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( (( (() (())
Stack S$ (S)$ S)$ (S))$ S))$ ))$ )$ $
Santanu Halder
w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$ )$ $
$ S→ε Action output S→(S) match '(' output S→(S) match '(' output S→ε match ')' match ')' 176 Compiler Design
Shift Reduce Parser The general idea is to shift some symbols of input to the stack until a reduction can be applied. At each reduction step, a specific substring matching the body of a production is replaced by the non-terminal at the head of the production. The key decisions during bottom-up parsing are about when to reduce and about what production to apply. A reduction is a reverse of a step in a derivation. The goal of a bottom-up parser is to construct a derivation in reverse: E=>T=>T*F=>T*id=>F*id=>id*id
Santanu Halder
177 Compiler Design
Handle Pruning A Handle is a substring that matches the body of a production and whose reduction represents one step along the reverse of a rightmost derivation. Example: Right Sentential Form id*id F*id T*id T*F
Santanu Halder
Handle id F id T*F
Reducing Production F→id T→F F→id E→T*F
178 Compiler Design
Shift Reduce Parsing A stack is used to hold grammar symbols. Handle always appear on top of the stack. Initial configuration: Stack Input $ w$ Acceptance configuration: Stack Input $S $
Santanu Halder
179 Compiler Design
Operator Precedence Grammar An operator precedence grammar is a context-free grammar that has the property that no production has either an empty right-hand side or two adjacent nonterminals in its right-hand side. Or a CFG which is used to define the mathematical operators is called an operator precedence grammar. Example: E→E+E | E*E | id Rely on the following three precedence relations between terminals: Relation Meaning a b a takes precedence over b Santanu Halder
180 Compiler Design
Operator Precedence Parser An operator precedence parser is a bottom-up parser that interprets an operator-precedence grammar. Able to parse an ambiguous grammar. Simple enough to write by hand. Can be written to consult an operator table at run time. Not used often in practice.
Santanu Halder
181 Compiler Design
Operator Precedence Parser.. E→E+E | E*E | id Terminals are +, *, id, $. Operator precedence relation table: 1. id has the highest precedence. 2. $ has the lowest precedence. 3. Both + and * are left-associative (hence +·>+ and *·>*) 4. Need not compare between id and id because of two adjacent non-terminals are not present in production rules.
Santanu Halder
182 Compiler Design
Operator Precedence Parser.. E→E+E | E*E | id
id id +
·> ·> 183 Compiler Design
Operator Precedence Parser.. Algorithm for parsing Initialize: Set ip to point to the first symbol of w$. Repeat: If $ is on the top of the stack and ip points to $ then return else Let a be the top terminal on the stack, and b the symbol pointed to by ip; if a b then repeat pop the stack; until (the top stack terminal is related by ·> ·> +
ip
$
+
a id
+
id
*
*
·> ·>
$
·> +
ip
id
+
*
·> ·>
$
·>
$
·> +
ip
id
+
a
*
·> ·>
$
·>
$
·> +
ip id
+
a E
*
·> ·>
$
·>
$
·>
$
·>
$
+
E
ip
+
id
*
·> ·>
$
b) then begin pop 2*|b| symbols off the stack; let s' be the state on top of the stack; push A then goto[s', A] on top of the stack; output the production A->b; end else if (action[s, a] = accept) then return else error(); end
Santanu Halder
212 Compiler Design
LR(0) Parsing Line Stack 1
0
Input aabb$
Action
Action States 0
a
b
s3
s4
1
Santanu Halder
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
213 Compiler Design
LR(0) Parsing Line Stack
Input
Action
Action
1
0
aabb$ Shift to 3
2
0a3
abb$
States 0
a
b
s3
s4
1
Santanu Halder
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
214 Compiler Design
LR(0) Parsing Line Stack
Input
Action
Action
1
0
aabb$ Shift to 3
2
0a3
abb$
3
0a3a3
bb$
Shift to 3
States 0
a
b
s3
s4
1
Santanu Halder
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
215 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
216 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→b means pop 2*|b|=2 symbols from stack which is here 4 and b. Then push A onto the stack and as the previous symbol is 3 in the stack, then perform goto(3,A) which gives 6. Also push 6 onto the stack and don’t forward the input pointer.
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
217 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
218 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Reduce by A→aA
7
0A2
b$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
219 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Reduce by A→aA
7
0A2
b$
Shift to 4
8
0A2b4
$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
220 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Reduce by A→aA
7
0A2
b$
Shift to 4
8
0A2b4
$
Reduce by A→b
9
0A2A5
$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
4
r3
r3
r3
5
r1
r1
r1
6
r2
r2
r2
221 Compiler Design
LR(0) Parsing Line Stack
Input
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Reduce by A→aA
7
0A2
b$
Shift to 4
8
0A2b4
$
9
0A2A5
$
10
0S1
$
Santanu Halder
Action
Action States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
Reduce by A→b
4
r3
r3
r3
Reduce by S→AA
5
r1
r1
r1
6
r2
r2
r2
222 Compiler Design
LR(0) Parsing Line Stack
Input
Action
Action
1
0
aabb$ Shift to 3
2
0a3
abb$
Shift to 3
3
0a3a3
bb$
Shift to 4
4
0a3a3b4
b$
Reduce by A→b
5
0a3a3A6
b$
Reduce by A→aA
6
0a3A6
b$
Reduce by A→aA
7
0A2
b$
Shift to 4
8
0A2b4
$
9
0A2A5
10
0S1
States 0
a
b
s3
s4
1
Goto $
A
S
2
1
acc ept
2
s3
s4
5
3
s3
s4
6
Reduce by A→b
4
r3
r3
r3
$
Reduce by S→AA
5
r1
r1
r1
$
Accept
6
r2
r2
r2
Santanu Halder
223 Compiler Design
SLR(1) Parsing Table 1. Construct C = {I0 , I1 , . . . , In } , the collection of sets of LR(0) items for the augmented grammar G' . 2. State i is constructed from Ii . The parsing actions for state i are determined as follows: a) If [A →α.Aβ] is in Ii and GOTO (Ii, a) = Ij , then set ACTION[i, a] to "shift j“. Here a must be a terminal. b) If [A→α.] is in Ii , then set ACTION[i, a] to "reduce A→α" for all a in FOLLOW(A); here A may not be S' . c) If [S'→S·] is in Ii , then set ACTION[i, $] to "accept“. If any conflicting actions result from the above rules, we say the grammar is not SLR(1) . The algorithm fails to produce a parser in this case. 3. The goto transitions for state i are constructed for all nonterminals A using the rule: If GOTO (Ii, A) = Ij , then GOTo[i, A] = j . 4. All entries not defined by rules (2) and (3) are made "error″. 5. The initial state of the parser is the one constructed from the set of items containing [S'→.S] .
Santanu Halder
224 Compiler Design
SLR(1) Parsing Table.. S'→ S S → AA A → aA A→b
1 2 3
States 0
a s3
Action b s4
1 First Follow S {a,b} {$} A {a,b} {a,b,$}
2
2
s3
s4
5
3
s3
s4
6
4
r3
r3
6
1
accept
5
Santanu Halder
$
Goto A S
r3 r1
r2
r2
r2 225 Compiler Design
SLR(1) Grammar Every SLR(1) grammar is unambiguous, but there are many unambiguous grammars that are not SLR(1). Consider the grammar with productions: S → L=R | R L → *R | id R→L
Santanu Halder
226 Compiler Design
Sets-of-LR(1)-items construction
Santanu Halder
227 Compiler Design
Sets-of-LR(1)-items construction..
Santanu Halder
228 Compiler Design
Sets-of-LR(1)-items construction.. Example: S → AA A → aA | b
Santanu Halder
229 Compiler Design
Sets-of-LR(1)-items construction.. Example: S → AA A → aA | b Add one more production rule: So, now the grammar is: S'→ S S → AA A → aA | b
Santanu Halder
S' → S
230 Compiler Design
Sets-of-LR(1)-items construction.. I5
I1 S
I0
S'→S.,$
I2
S'→.S,$ A S→A.A,$ S→.AA,$ A→.aA,$ A→.aA,a/b A→.b,$ A→.b,a/b a
a b
I3
A→a.A,a/b A→.aA,a/b A→.b,a/b
S→AA.,$
A
I6 A
I9 A→aA.,$
A→a.A,$ a A→.aA,$ A→.b,$
b
a
I7 b A→b.,$
A
I8 A→aA.,a/b
I4 b A→b.,a/b
Santanu Halder
231 Compiler Design
Sets-of-LR(1)-items construction.. The I9 state S'→S.,$ S→AA.,$ S I0 A I2 I6 A A→aA.,$ pairs (I4, I7), (I8, S'→.S,$ S→A.A,$ A→a.A,$ A a I9) and a A→.aA,$ S→.AA,$ A→.aA,$ (I3,I6) A→.aA,a/b A→.b,$ A→.b,$ have the A→.b,a/b b I7 b a I3 same A→b.,$ a A→a.A,a/b LR(0) I A A→.aA,a/b b 8 with A→.b,a/b A→aA.,a/b different I4 b Lookahead A→b.,a/b I1
Santanu Halder
I5
232 Compiler Design
Canonical LR(1) Parsing Table
Santanu Halder
233 Compiler Design
Canonical LR(1) Parsing Table.. States 0 1 2 3 4 5 6 7 8 9
a s3
Action b s4
Goto $
A 2
S 1
accept s6 s3 r3
s7 s4 r3
5 8 r1
s6
s7
9 r3
r2
Santanu Halder
r2 r2 234 Compiler Design
Canonical LR(1) Parsing Table.. States 0 1 2 3 4 5 6 7 8 9
a s3
Action b s4
Goto $
A 2
accept s6 s3 r3
s7 s4 r3
5 8 r1
s6
s7
9 r3
r2
Santanu Halder
r2 r2
S 1
As the state pairs (I4, I7), (I8, I9) and (I3, I6) have the same LR(0), merge the rows in the parse table and get the new parse table as LALR(1) parsing table 235 Compiler Design
LALR(1) Parsing Table States 0 1 2 36 47 5 89
a s36
Action b s47
Goto $
A 2
accept s36 s36 r3
s47 s47 r3
r2
r2
Santanu Halder
5 89
S 1
Renaming State 3 and 6 as (36), State 4 and 7 as (47), State 8 and 9 as (89).
r3 r1 r2
236 Compiler Design
End of Unit III
Santanu Halder
237 Compiler Design