3-Syntax Analyzer PDF

UNIT - III Syntax Analyzer Santanu Halder 1 Compiler Design Role of Parser Santanu Halder 2 Compiler Design Role

Views 71 Downloads 0 File size 1MB

Report DMCA / Copyright

DOWNLOAD FILE

Recommend stories

Citation preview

UNIT - III Syntax Analyzer Santanu Halder

1 Compiler Design

Role of Parser

Santanu Halder

2 Compiler Design

Role of Syntax Analyzer if (b == 0) a = b; Lexical Analysis or Scanner if

(

b

==

0

)

a

=

b

;

Syntax Analysis or Parsing if == b Santanu Halder

abstract syntax tree or parse tree

= 0

a

b 3 Compiler Design

Grammar Definition: Generally a grammar is represented as 4-tuples – G = Where V = finite set of variables (or non-terminals) T = finite set of terminal symbols P = finite set of productions S = Starting symbol Example: G = ({S}, {a, b}, {S→aSb, S→ab}, S) What language does this grammar accept?

Santanu Halder

4 Compiler Design

Grammar.. Definition: Generally a grammar is represented as 4-tuples – G = Where V = finite set of variables (or non-terminals) T = finite set of terminal symbols P = finite set of productions S = Starting symbol Example: G = ({S}, {a, b}, {S→aSb, S→ab}, S) What language does this grammar accept? L(G) = {anbn | n≥1}

Santanu Halder

5 Compiler Design

Chomsky Classification Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)*

Santanu Halder

6 Compiler Design

Chomsky Classification.. Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)* Example: S→S1B S1→aS1B bB→bbbB aS1b→aa B→ϵ

Santanu Halder

7 Compiler Design

Chomsky Classification.. Type 0 or Unrestricted grammar: A grammar G = (V, T, P, S) is said to be unrestricted if all the production rules are of the form – A→α where A is in (VUT)+ and α is in (VUT)* Example: S→S1B S1→aS1B bB→bbbB aS1b→aa B→ϵ L(G) = {an+1bn+k | n≥1, k=-1,1,3….} Santanu Halder

8 Compiler Design

Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α|

Santanu Halder

9 Compiler Design

Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α| Example: S→abc | aAbc Ab→bA Ac→Bbcc bB→Bb aB→aa | aaA

Santanu Halder

10 Compiler Design

Chomsky Classification.. Type 1 or Context-Sensitive grammar: A grammar G = (V, T, P, S) is said to be context-sensitive grammar if all the production rules are of the form – A→α where A, α is in (VUT)+ and |A| ≤ |α| Example: S→abc | aAbc Ab→bA Ac→Bbcc bB→Bb aB→aa | aaA L(G) = {anbncn | n≥1} Santanu Halder

11 Compiler Design

Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)*

Santanu Halder

12 Compiler Design

Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)* Example: S→aSa S→bSb S→ϵ

Santanu Halder

13 Compiler Design

Chomsky Classification.. Type 2 or Context-Free grammar: A grammar G = (V, T, P, S) is said to be context-free grammar if all the production rules are of the form – A→α where A is in V and α is in (VUT)* Example: S→aSa S→bSb S→ϵ L(G) = {WWR | W is in {a, b}*}

Santanu Halder

14 Compiler Design

Chomsky Classification.. Type 3 or Regular Grammar: A grammar G = (V, T, P, S) is said to be right linear grammar if all the production rules are of the form – A→xB or A→x where A, B is in V and x is in T* A grammar G = (V, T, P, S) is said to be left linear grammar if all the production rules are of the form – A→Bx or A→x where A, B is in V and x is in T* A regular grammar is one that is either left linear or right linear.

Santanu Halder

15 Compiler Design

Chomsky Classification.. Example: Right linear S→abS | a

Santanu Halder

L(G) = {(ab)*a}

16 Compiler Design

Chomsky Classification.. Example: Right linear S→abS | a Left Linear S→S1ab S1→S1ab | S2 S2→a

Santanu Halder

L(G) = {(ab)*a} L(G) = {aab(ab)*}

17 Compiler Design

Chomsky Classification.. Example: Right linear S→abS | a Left Linear S→S1ab S1→S1ab | S2 S2→a Linear grammar S→A A→aB | ϵ B→Ab

Santanu Halder

L(G) = {(ab)*a} L(G) = {aab(ab)*}

L(G) = {anbn | n ≥ 0}

18 Compiler Design

Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1}

Santanu Halder

19 Compiler Design

Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab}

Santanu Halder

20 Compiler Design

Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab} 2. Construct a CFG to generate the set of palindromes over alphabet {a, b}.

Santanu Halder

21 Compiler Design

Context Free Grammar 1. Generate a CFG for L(G) = {anbn | n≥1} G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ab} 2. Construct a CFG to generate the set of palindromes over alphabet {a, b}. G = (V, T, P, S) V = {S}, T = {a, b}, And for even palindrome – P = {S→aSa, S→bSb, S→ ε} And for odd palindrome – P = {S→aSa, S→bSb, S→a, S→b}

Santanu Halder

22 Compiler Design

Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}.

Santanu Halder

23 Compiler Design

Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ ε} 4. Design a CFG for L(G) = {aibicjdj | i,j≥0}.

Santanu Halder

24 Compiler Design

Context Free Grammar 3. Design a CFG for a given language L(G) = {aibi | i≥0}. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aSb, S→ ε} 4. Design a CFG for L(G) = {aibicjdj | i,j≥0}. Let, L1 = {aibi | i ≥ 0} and L2 = {cjdj | j ≥ 0} Hence L(G) = L1·L2 L1 can be expressed as {A→aAb | ε} L2 can be expressed as {B→cBd | ε} Finally, P = {S→AB, A→aAb | ε, B→cBd | ε}

Santanu Halder

25 Compiler Design

Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}.

Santanu Halder

26 Compiler Design

Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0

Santanu Halder

27 Compiler Design

Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0 This can be rewritten as – L1 = {0i1i | i ≥ 0} L2 = {1m | m >0} L3 = {1k0k| k ≥ 0} And hence – L(G) = L1·L2·L3

Santanu Halder

28 Compiler Design

Context Free Grammar 5. Generate a CFG for L(G) = {0i1j0k | j>i+k and i,k≥0}. L(G) = 0i1j0k for i, k ≥ 0 and j > i+k = 0i1i1m1k0k for m > 0 This can be rewritten as – L1 = {0i1i | i ≥ 0} L2 = {1m | m >0} L3 = {1k0k| k ≥ 0} And hence – L(G) = L1·L2·L3 L1 can be expressed as {A→0A1 | ε} L2 can be expressed as {B→1B | 1} L3 can be expressed as {C→1C0 | ε} Finally, P = {S→ABC, A→0A1| ε, B→1B|1, C→1C0 | ε} Santanu Halder

29 Compiler Design

Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}.

Santanu Halder

30 Compiler Design

Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc}

Santanu Halder

31 Compiler Design

Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc} 7. Design a CFG for L(G) = {WWR | W is a binary string}.

Santanu Halder

32 Compiler Design

Context Free Grammar 6. Design a CFG for L(G) = {e5aibjcjdi | i,j>0}. G = (V, T, P, S) V = {S, S1, S2, S3}, T = {a, b, c, d}, P = { S→e5aS1d, S1→ aS2d | bc, S2→aS2d | bS3c | bc, S3→bS3c | bc} 7. Design a CFG for L(G) = {WWR | W is a binary string}. G = (V, T, P, S) V = {S}, T = {0, 1}, P = {S→0S0 | 1S1 | ε}

Santanu Halder

33 Compiler Design

Context Free Grammar 8. Design a CFG for L(G) = (a+b)*.

Santanu Halder

34 Compiler Design

Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε}

Santanu Halder

35 Compiler Design

Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}.

Santanu Halder

36 Compiler Design

Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}. Case 1: n < m L = {anbnbi | i > 0} L1 = {anbn} L2 = {bi | i > 0}

Santanu Halder

37 Compiler Design

Context Free Grammar 8. Design a CFG for L(G) = (a+b)*. G = (V, T, P, S) V = {S}, T = {a, b}, P = {S→aS | Sb | a | b | ε} 9. Design a CFG for L(G) = {anbm | n≠m}. Case 1: n < m L = {anbnbi | i > 0} L1 = {anbn} L2 = {bi | i > 0} Hence, L1 can be expressed as : {A→aAb | ε} L2 can be expressed as : {B→bB | b} So, P1 = {S1→AB, A→aAb | ε, B→bB | b} Santanu Halder

38 Compiler Design

Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0}

Santanu Halder

39 Compiler Design

Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0} Hence, L1 can be expressed as : {C→aC | a} L2 can be expressed as : {D→aDb | ε} So, P2 = {S2→CD, C→aC | a, D→aDb | ε }

Santanu Halder

40 Compiler Design

Context Free Grammar Case 2: n > m L = {aianbn | i > 0} L1 = {ai | i > 0} L2 = {anbn | i > 0} Hence, L1 can be expressed as : {C→aC | a} L2 can be expressed as : {D→aDb | ε} So, P2 = {S2→CD, C→aC | a, D→aDb | ε } Finally – P = { S→S1 | S2, S1→AB, A→aAb | ε, B→bB | b S2→CD, C→aC | a, D→aDb | ε } Santanu Halder

41 Compiler Design

Leftmost Derivation A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced.

Santanu Halder

42 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is –

Santanu Halder

43 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB

Santanu Halder

44 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb

Santanu Halder

45 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A

Santanu Halder

46 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb

Santanu Halder

47 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb => abbbbB B→ ϵ

Santanu Halder

48 Compiler Design

Leftmost Derivation.. A derivation is said to be leftmost if in each step of the derivation, the leftmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Leftmost derivation for the string abbbb is – S => aAB S→aAB => abBbB A→bBb => abAbB B→A => abbBbbB A→bBb => abbbbB B→ ϵ => abbbb B→ ϵ

Santanu Halder

49 Compiler Design

Rightmost Derivation A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced.

Santanu Halder

50 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is –

Santanu Halder

51 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB

Santanu Halder

52 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A

Santanu Halder

53 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb

Santanu Halder

54 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ

Santanu Halder

55 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ => abBbbb A→bBb

Santanu Halder

56 Compiler Design

Rightmost Derivation.. A derivation is said to be rightmost if in each step of the derivation, the rightmost variable in the sentential form is replaced. Example: Consider the grammar with productions – S→aAB A→bBb B→A | ϵ Rightmost derivation for the string abbbb is – S => aAB S→aAB => aAA B→A => aAbBb A→bBb => aAbb B→ ϵ => abBbbb A→bBb => abbbb B→ ϵ

Santanu Halder

57 Compiler Design

Derivation Tree Let G=(V, T, P, S) be a CFG. An ordered tree is a derivation tree for G iff it has the following properties: 1. The root is labeled S. 2. Every leaf has a label from T U {ϵ}. 3. Every interior vertex has a label from V. 4. If a vertex has label A (which is in V), and its children are labeled from left to right as a1, a2, …, an, then P must contain a production of the form – A→a1a2…an 5. A leaf labeled ϵ has no siblings.

Santanu Halder

58 Compiler Design

Derivation Tree.. S

a

A

b

B

B

b

ϵ

ϵ Santanu Halder

59 Compiler Design

Simplification of CFG Steps to simplify a grammar – Step 1: Eliminate null production. Step 2: Eliminate unit production. Step 3: Eliminate production that contains useless variables. Useless variables are of two types – 1. Variables which are not deriving terminal string. 2. Variables that can’t be reached from start symbol.

Santanu Halder

60 Compiler Design

Simplification of CFG.. Elimination of null production: Let G={V, T, P, S} and G′={V, T, P′, S} Step 1: Say w1={A ϵ V | A→ ɛ is in P} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ wi*} Stop when wi = wi+1 and wi is the set of all null able variables.

Santanu Halder

61 Compiler Design

Simplification of CFG.. Elimination of null production: Let G={V, T, P, S} and G′={V, T, P′, S} Step 1: Say w1={A ϵ V | A→ ɛ is in P} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ wi*} Stop when wi = wi+1 and wi is the set of all null able variables. Step 2: a) Any productions, whose RHS does not have any nullable variable, is included in P′. b) If A→X1X2…Xk is in P, the productions of the form A→α1α2…αk are included in P′, where αi = Xi if Xi is not in wi. αi = Xi or ɛ if xi ϵ wi and α1α2…αk ≠ ɛ.

Santanu Halder

62 Compiler Design

Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d}

Santanu Halder

63 Compiler Design

Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ

Santanu Halder

64 Compiler Design

Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ w2 = w1 U {A} Since A→BC = {A, B, C}

Santanu Halder

65 Compiler Design

Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: w1 = {B, C} Since B→ɛ and C→ɛ w2 = w1 U {A} Since A→BC = {A, B, C} w3 = w2 U {Φ} Since no more such productions = {A, B, C}

Santanu Halder

66 Compiler Design

Simplification of CFG.. Elimination of null production: Example: G = (V, T, P, S) V = {S, A, B, C, D} T = {a, b, d} P = {S→ABaC, A→BC, B→b | ɛ, C→D | ɛ, D→d} Solution: Hence – P′ = { S→ABaC|BaC|AaC|ABa|aC|Aa|Ba|a A→BC|C|B B→b C→D D→d}

Santanu Halder

67 Compiler Design

Simplification of CFG.. Elimination of Unit Production: Any production of a CFG of the form A→B where A, B ϵ V is called an unit production.

Santanu Halder

68 Compiler Design

Simplification of CFG.. Elimination of Unit Production: Any production of a CFG of the form A→B where A, B ϵ V is called an unit production. For the following grammar, S→Aa |B B→A|bb A→a|bc|B The unit productions are: S→B B→A A→B The dependency graph for the unit productions is – S

Santanu Halder

A

B

69 Compiler Design

Simplification of CFG.. Elimination of Unit Production: S

A

B

Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb

Santanu Halder

70 Compiler Design

Simplification of CFG.. Elimination of Unit Production: S

A

B

Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb The new rules: S→a | bc | bb A→bb B→a | bc Santanu Halder

71 Compiler Design

Simplification of CFG.. Elimination of Unit Production: S

A

B

Hence, P′ contains: Original non-unit productions: S→Aa A→a | bc B→bb The new rules: Altogether, S→a | bc | bb S→Aa | a | bc | bb A→bb A→a | bc | bb B→a | bc B→bb | a | bc Santanu Halder

72 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Let G={V, T, P, S} and G′={V′, T, P′, S} Step 1: Say w1={A ϵ V | A→ x is in P where xϵT*} wi+1=wi U {A ϵ V | there exists a production A→ x with x ϵ {T U wi}*} Stop when wi = wi+1 and then V′ = {wi}. Step 2: P′={A→x | A, x ϵ (V′UT)*}

Santanu Halder

73 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c }

Santanu Halder

74 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c

Santanu Halder

75 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E}

Santanu Halder

76 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E} w3 = w2 U {Φ} Since no more such productions. = {S, A, B, E}

Santanu Halder

77 Compiler Design

Simplification of CFG.. Elimination of variables not deriving terminal string: Example: G = (V, T, P, S) V = {S, A, B, C} T = {a, b, c} and P = { S→AB, A→a, B→b, B→C, E→c } w1 = {A, B, E} Since A→a, B→b, E→c w2 = w1 U {S} Since S→AB = {S, A, B, E} w3 = w2 U {Φ} Since no more such productions. = {S, A, B, E} Hence, V′ = {S, A, B, E} So, P′ = { S→AB, A→a, B→b, E→c }

Santanu Halder

78 Compiler Design

Simplification of CFG.. Elimination of variables not reachable from start symbol: Let G={V, T, P, S} and G′={V′, T, P′, S} Step 1: Draw dependency graph for the variables. A variable is useful only if there is a path from the vertex labelled S. V′ = {useful variables} Step 2: P′={A→x | A, x ϵ (V′UT)*}

Santanu Halder

79 Compiler Design

Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa }

Santanu Halder

80 Compiler Design

Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa } The dependency graph for V is:

S

Santanu Halder

A

B

81 Compiler Design

Simplification of CFG.. Elimination of variables not reachable from start symbol: Example: G = (V, T, P, S) V = {S, A, B} T={a} P = { S→aS | A, A→a, B→aa } The dependency graph for V is:

S

A

B

Hence V′ = {S, A} So, P′ = {{ S→aS | A, A→a}

Santanu Halder

82 Compiler Design

Normal Forms for CFG  In a context free grammar, then right hand side of a production rule can be any string of terminals and variables. When the production rules in a grammar satisfy certain restrictions, then the grammar is said to be in a normal form.  The normal forms for CFG are of two types: 1. Chomsky Normal Form (CNF) 2. Greibach Normal Form (GNF)

Santanu Halder

83 Compiler Design

Chomsky Normal Form (CNF) Definition: A context free grammar is in CNF, if all the production rules are of the form: A → BC or, A → a Where A, B, C ϵ V and a ϵ T. Example: S → AS | a A → SA | b

Santanu Halder

84 Compiler Design

Greibach Normal Form (GNF) Definition: A context free grammar is said to be in GNF, if all the production rules are of the form: A → ax Where a ϵ T and x ϵ V*. Example: S → aAB | bBB | bB A → aA | bB | b B→b

Santanu Halder

85 Compiler Design

S - Grammar A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P.

Santanu Halder

86 Compiler Design

S - Grammar.. A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P. Example: S→aS S→aS S→bSS S→bSS S→c S→aSS S→c

Santanu Halder

87 Compiler Design

S - Grammar.. A context free grammar G = (V, T, P, S) is said to be a simple grammar or S-Grammar if all its productions are of the form – A→aX Where A is in V, a is in T and X is in V*, and any pair (A, a) occurs at most once in P. Example: S→aS S→aS S→bSS S→bSS S→c S→aSS S→c

Santanu Halder

88 Compiler Design

Ambiguous Grammar A context free grammar G = (V, T, P, S) is said to be ambiguous if there exists some w (which is in L(G)) that has at least two distinct derivation trees.

Santanu Halder

89 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id

Santanu Halder

90 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id

E E

+

E

id

E

+

id

Santanu Halder

E id 91 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id + id

E

E

E

+

E

id

E

+

id

Santanu Halder

E

E

id

id

E

+

E

+

E

id

id 92 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id

Santanu Halder

93 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id

E E

+

E

id

E

*

id

Santanu Halder

E id 94 Compiler Design

Ambiguous Grammar.. Example: E→E+E | E*E | id id + id * id

E

E

E

+

E

id

E

*

id

Santanu Halder

E

E

id

id

E

*

E

+

E

id

id 95 Compiler Design

Ambiguous Grammar.. Solution: Associate precedence rules with the operators. E→E+E | E*E | id Rewrite the production rules as follows to make it unambiguous – E→E+T | T T→T*F | F F→id

Santanu Halder

96 Compiler Design

Ambiguous Grammar.. E

E

E

+

T

E

+

T

+

T

F

T

T

*

T

F

id

F

F

F

id

id

id

E

F id

id

Santanu Halder

97 Compiler Design

Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. bExp → bExp or bExp → bExp and bExp → not bExp → True | False

Santanu Halder

98 Compiler Design

Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. bExp → bExp or bExp → bExp and bExp → not bExp → True | False Solution: E → E or T | T T → F and G | G G → not H | True | False

Santanu Halder

99 Compiler Design

Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. stmt → if expr then stmt stmt → if expr then stmt else stmt stmt → other

Santanu Halder

100 Compiler Design

Ambiguous Grammar.. Example: Rewrite the following production rules to make the grammar unambiguous. stmt → if expr then stmt stmt → if expr then stmt else stmt stmt → other Solution: Idea: A statement appearing between a then and an else must be matched stmt → match_stmt | open_stmt match_stmt → if expr then match_stmt else match_stmt match_stmt → other open_stmt → if expr then stmt open_stmt → if expr then match_stmt else open_stmt Santanu Halder

101 Compiler Design

Right Recursive Grammar A grammar G is said to be right recursive if it contains the following type of production rules: A→αA | β

Santanu Halder

102 Compiler Design

Left Recursive Grammar A grammar G is said to be left recursive if it contains the following type of production rules: A→Aα | β In other words, a grammar G is left recursive if it has a nonterminal A such that there is a derivation A+=> Aα for some string α. Top down parsing methods can not accept left recursive grammar.

Santanu Halder

103 Compiler Design

Elimination of Left Recursion Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ

Santanu Halder

104 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: E→E+T | T

Santanu Halder

105 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: E→E+T | T After elimination of left recursion – E → TE’ E’→ +TE’ | ϵ

Santanu Halder

106 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S→S0S1S | 01

Santanu Halder

107 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S→S0S1S | 01 After elimination of left recursion – S → 01S’ S’→ 0S1SS’ | ϵ

Santanu Halder

108 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x L → L,S | S

Santanu Halder

109 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x not left recursive L → L,S | S left recursive

Santanu Halder

110 Compiler Design

Elimination of Left Recursion.. Replace the following production rules: A→Aα | β By A → βA’ A’→ αA’ | ϵ Example: S → (L) | x not left recursive L → L,S | S left recursive So, after elimination of left recursion from production rule 2 – L → SL’ L’→ ,SL’ | ϵ

Santanu Halder

111 Compiler Design

Left Factoring  Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive or top-down parsing.  When the choice between two alternative A-productions is not clear, one may be able to rewrite the productions to defer the decision until enough of the input, has been seen that he/she can make the right choice.  For example – A → αβ1 | αβ2 | … | αβn  As a solution, rewrite the production rules as – A → αA’ A’→ β1 | β2 … | βn

Santanu Halder

112 Compiler Design

Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e

Santanu Halder

A → aBcDE B → aBdEC B→b C→c D→d E→e

113 Compiler Design

Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e Non-deterministic grammar

Santanu Halder

A → aBcDE B → aBdEC B→b C→c D→d E→e Deterministic Grammar

114 Compiler Design

Left Factoring.. A → aBcDE A → aBdEC B→b C→c D→d E→e Non-deterministic grammar Apply left factoring

Santanu Halder

A → aBcDE B → aBdEC B→b C→c D→d E→e Deterministic Grammar Does not require left factoring

115 Compiler Design

Left Factoring.. Example: S → iEtS | iEtSeS | a E→b

Santanu Halder

116 Compiler Design

Left Factoring.. Example: S → iEtS | iEtSeS | a E→b Solution: S → iEtSS’ | a S’→ ϵ | eS E→b

Santanu Halder

117 Compiler Design

Left Factoring.. Example: S → aSSbS | aSaSb | abb | b

Santanu Halder

118 Compiler Design

Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb

Santanu Halder

119 Compiler Design

Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb Step 2: S’→ SS’’ | bb S’’→ SbS | aSb

Santanu Halder

120 Compiler Design

Left Factoring.. Example: S → aSSbS | aSaSb | abb | b Solution: Step 1: S → aS’ | b S’→ SSbS | SaSb | bb Step 2: S’→ SS’’ | bb S’’→ SbS | aSb Altogether: S → aS’ | b S’→ SS’’ | bb S’’→ SbS | aSb Santanu Halder

121 Compiler Design

Closure Properties of CFL and RL 1. If L1 be a context free language and L2 be a regular language, then L1∩L2 is a context free language. 2. Regular sets are closed under transpose. 3. Regular sets are closed under complementation. 4. Regular sets are closed under intersection. 5. Context free languages are closed under union. 6. Context free languages are closed under concatenation. 7. Context free languages are closed under star closure. 8. Context free languages are not closed under intersection. 9. Context free languages are not closed under complementation.

Santanu Halder

122 Compiler Design

Type of Parser Parsers Top Down Parsers TDP with full Backtracking Brute Force Method

Santanu Halder

Bottom Up Parsers TDP without Backtracking

Recursive Descent

Non Recursive Descent (LL(1))

123 Compiler Design

Type of Parser.. Parsers Top Down Parsers

Bottom Up Parsers

Operator Precedence Parser

LR Parser LR(0)

Santanu Halder

SLR(1)

LALR(1)

CLR(1)

124 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

Santanu Halder

w = abbcde

125 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser S a

A

Santanu Halder

B

e

126 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser S a

A A

b

B

e

c

Santanu Halder

127 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser S a

A A

b

B

e

c

b

Santanu Halder

128 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser S a

A A

b

B c

e

d

b

Santanu Halder

129 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

S a

A A

b

B c

b

Santanu Halder

e

d a

b

b

c

d

e

130 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

S a

A A

b

B c

b

Santanu Halder

e

d

A a

b

b

c

d

e

131 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

S a

A A

b

B c

b

Santanu Halder

e

A

d

A a

b

b

c

d

e

132 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

S a

A A

b

B c

b

Santanu Halder

e

A

d

B

A a

b

b

c

d

e

133 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

S a

S

A A

b

B c

b

Santanu Halder

e

A

d

B

A a

b

b

c

d

e

134 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation)

(Right most derivation)

Santanu Halder

135 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation) S => aABe

(Right most derivation) S => aABe

Santanu Halder

136 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation) S => aABe => aAbcBe

(Right most derivation) S => aABe => aAde

Santanu Halder

137 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation) S => aABe => aAbcBe => abbcBe

(Right most derivation) S => aABe => aAde => aAbcde

Santanu Halder

138 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation)

(Reverse of Right most derivation)

S => aABe => aAbcBe => abbcBe => abbcde

Santanu Halder

S => aABe => aAde => aAbcde => abbcde

139 Compiler Design

Difference between TDP and BUP S → aABe A → Abc | b B→d

w = abbcde

Top down parser

Bottom up parser

(Left most derivation)

(Reverse of Right most derivation)

S => aABe => aAbcBe => abbcBe => abbcde

Santanu Halder

S => aABe => aAde => aAbcde => abbcde

140 Compiler Design

Recursive Descent Parser  Consists of a set of procedures, one for each nonterminal.  Execution begins with the procedure for start symbol.  Recursive descent parsers cant be used for left-recursive grammars.

Santanu Halder

141 Compiler Design

Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε

Santanu Halder

142 Compiler Design

Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }

Santanu Halder

143 Compiler Design

Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }

Santanu Halder

E’() { if(l == '+') { match(+); match(i); E’(); } else return(); } 144 Compiler Design

Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }

Santanu Halder

E’() { if(l == '+') { match('+'); match('i'); E’(); } else return(); }

match(char t) { if(l == t) l = getchar(); else printf(“error”); }

145 Compiler Design

Recursive Descent Parser.. E → iE’ E’→ +iE’ | ε E() { if(l == 'i') { match('i'); E’(); } }

Santanu Halder

E’() { if(l == '+') { match('+'); match('i'); E’(); } else return(); }

match(char t) { if(l == t) l = getchar(); else printf(“error”); } main() { E(); if(l=='$') printf(“accept”); } 146 Compiler Design

FIRST and FOLLOW To compute FIRST(X) for all symbols X in grammar G, apply following rules until no more terminals or ɛ can be added to any FIRST set: 1. If X is a terminal, then FIRST(X) = {X}. 2. If X is a non-terminal and X→Y1Y2…Yk is a production for some k≥1, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ɛ is in all of FIRST(Y1),…,FIRST(Yi-1). If ɛ is in FIRST(Yj) for j=1…k then add ɛ to FIRST(X). 3.If X→ ɛ is a production, then add ɛ to FIRST(X).

Santanu Halder

147 Compiler Design

FIRST and FOLLOW.. To compute FOLLOW(A) for all non-terminals A, apply the following rules until nothing can be added to any FOLLOW set. 1. Place $ in FOLLOW(S) , where S is the start symbol, and $ is the input right end marker . 2. If there is a production A → αBβ, then everything in FIRST(β) except ɛ is in FOLLOW(B). 3. If there is a production A → αB, or a production A →αBβ, where FIRST(β) contains ɛ, then everything in FOLLOW(A) is in FOLLOW(B) .

Santanu Halder

148 Compiler Design

FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S→ABCDE A→a | ɛ B→b | ɛ C→c D→d | ɛ E→e | ɛ

Santanu Halder

149 Compiler Design

FIRST and FOLLOW.. Solution: S→ABCDE A→a | ɛ B→b | ɛ C→c D→d | ɛ E→e | ɛ

Santanu Halder

S A B C D E

FIRST {a, b, c} {a, ɛ} {b, ɛ} {c} {d, ɛ} {e, ɛ}

FOLLOW {$} {b, c} {c} {d, e, $} {e, $} {$}

150 Compiler Design

FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S→Bb | Cd B→aB | ɛ C→cC | ɛ

Santanu Halder

151 Compiler Design

FIRST and FOLLOW.. Solution: S→Bb | Cd B→aB | ɛ C→cC | ɛ

Santanu Halder

FIRST S {a, b, c, d} B {a, ɛ} C {c, ɛ}

152 Compiler Design

FIRST and FOLLOW.. Solution: S→Bb | Cd B→aB | ɛ C→cC | ɛ

Santanu Halder

FIRST S {a, b, c, d} B {a, ɛ} C {c, ɛ}

FOLLOW {$} {b} {d}

153 Compiler Design

FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)

Santanu Halder

154 Compiler Design

FIRST and FOLLOW.. Solution: E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)

Santanu Halder

E E’ T T’ F

FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}

155 Compiler Design

FIRST and FOLLOW.. Solution: E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)

Santanu Halder

E E’ T T’ F

FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}

FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )}

156 Compiler Design

FIRST and FOLLOW.. Example: Find FIRST and FOLLOW for the following grammar. S → ACB | CbB | Ba A → da | BC B→g|ɛ C→h|ɛ

Santanu Halder

157 Compiler Design

FIRST and FOLLOW.. Solution: S → ACB | CbB | Ba A → da | BC │ ε B→g|ɛ C→h|ɛ

Santanu Halder

S A B C

FIRST {d, g, h, ɛ, b, a} {d, g, h, ɛ} {g, ɛ} {h, ɛ}

158 Compiler Design

FIRST and FOLLOW.. Solution: S → ACB | CbB | Ba A → da | BC B→g|ɛ C→h|ɛ

Santanu Halder

S A B C

FIRST {d, g, h, ɛ, b, a} {d, g, h, ɛ} {g, ɛ} {h, ɛ}

FOLLOW {$} {h, g, $} {$, a, h, g} {g, $, b, h}

159 Compiler Design

LL(1) Grammar Input buffer LL(1)

$

LL(1) parser $ Stack

Scan input from Left to right

Left most deriva tion

Size of look ahead

LL(1) parsing table

Santanu Halder

160 Compiler Design

LL(1) Grammar An unambiguous, non left recursive grammar G is LL(l) if and only if whenever A →α | β are two distinct productions of G, the following conditions hold: 1. For no terminal a, do both α and β derive strings beginning with a. 2. At most one of α and β can derive the empty string. 3. If , then α does not derive any string beginning with a terminal in FOLLOW(A) . Likewise, if then β does not derive any string beginning with a terminal in FOLLOW(A).

Santanu Halder

161 Compiler Design

Parser table for LL(1) Method: For each production A→α in the grammar, do the following to construct a parser table – 1.For each terminal a in FIRST(A), add A→α to M[A, a]. 2.If ε is in FIRST(α), then for each terminal b in FOLLOW(A), add A→α to M[A, b]. If ε is in FIRST(α), and $ is in FOLLOW(A), then add A→α to M[A, $] as well.

Santanu Halder

162 Compiler Design

Parser table for LL(1).. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E)

Santanu Halder

E E’ T T’ F

FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (}

FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )}

163 Compiler Design

Parser table for LL(1).. E → TE’ E’→ +TE’ | ɛ T → FT’ T’→ *FT’ | ɛ F → id | (E) id E

FOLLOW {$, )} {$, )} {+, $, )} {+, $, )} {*, +, $, )} ( )

$

E→TE’ E’→+TE’

T→FT’

T’ F

+

E→TE’

E’ T

E E’ T T’ F

FIRST {id, (} {+, ɛ} {id, (} {*, ɛ} {id, (} *

E’→ε

E’→ε

T’→ ε

T’→ ε

T→FT’ T’→ε

F→id

Santanu Halder

T’→*FT’ F→(E)

164 Compiler Design

Parser table for LL(1).. S → (S) | ɛ

Santanu Halder

FIRST S {(, ɛ}

FOLLOW {$, )}

165 Compiler Design

Parser table for LL(1).. S → (S) | ɛ

FIRST S {(, ɛ}

( ) S S→(S) S→ε

Santanu Halder

FOLLOW {$, )}

$ S→ε

166 Compiler Design

Table driven LL(1) parsing Algo INPUT: A string w and a parsing table M for grammar G. OUTPUT: If w is in L(G) , a leftmost derivation of w; otherwise, an error indication. METHOD: set ip to point to the first symbol of w; set X to the top stack symbol; While(X≠$) { // a is current symbol of w. if ( X is a ) {pop the stack and advance ip;} elseif ( X is a terminal ) error(); elseif ( M [X, a] is an error entry ) {error() ;} elseif ( M[X, a] = X→Y1Y2…Yk) { output the production X →Y1Y2…Yk; pop the stack; push Yk, Yk -l, ... , Y1 onto the stack, with Y1 on top; } set X to the top stack symbol; }

Santanu Halder

167 Compiler Design

Parsing for LL(1) S → (S) | ɛ

FIRST S {(, ɛ}

( ) S S→(S) S→ε

FOLLOW {$, )}

$ S→ε

Show the sequence of moves to parse the string (()) using the above parse table.

Santanu Halder

168 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched

Stack S$

Santanu Halder

w= (()) Input (())$

$ S→ε Action

169 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched

Stack S$ (S)$

Santanu Halder

w= (()) Input (())$ (())$

$ S→ε Action output S→(S)

170 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched (

Stack S$ (S)$ S)$

Santanu Halder

w= (()) Input (())$ (())$ ())$

$ S→ε Action output S→(S) match '('

171 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( (

Stack S$ (S)$ S)$ (S))$

Santanu Halder

w= (()) Input (())$ (())$ ())$ ())$

$ S→ε Action output S→(S) match '(' output S→(S)

172 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( ((

Stack S$ (S)$ S)$ (S))$ S))$

Santanu Halder

w= (()) Input (())$ (())$ ())$ ())$ ))$

$ S→ε Action output S→(S) match '(' output S→(S) match '('

173 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( ((

Stack S$ (S)$ S)$ (S))$ S))$ ))$

Santanu Halder

w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$

$ S→ε Action output S→(S) match '(' output S→(S) match '(‘ output S→ε

174 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( (( (()

Stack S$ (S)$ S)$ (S))$ S))$ ))$ )$

Santanu Halder

w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$ )$

$ S→ε Action output S→(S) match '(' output S→(S) match '(‘ output S→ε match ')'

175 Compiler Design

Parsing for LL(1).. ( ) S S→(S) S→ε Matched ( ( (( (( (() (())

Stack S$ (S)$ S)$ (S))$ S))$ ))$ )$ $

Santanu Halder

w= (()) Input (())$ (())$ ())$ ())$ ))$ ))$ )$ $

$ S→ε Action output S→(S) match '(' output S→(S) match '(' output S→ε match ')' match ')' 176 Compiler Design

Shift Reduce Parser  The general idea is to shift some symbols of input to the stack until a reduction can be applied.  At each reduction step, a specific substring matching the body of a production is replaced by the non-terminal at the head of the production.  The key decisions during bottom-up parsing are about when to reduce and about what production to apply.  A reduction is a reverse of a step in a derivation.  The goal of a bottom-up parser is to construct a derivation in reverse: E=>T=>T*F=>T*id=>F*id=>id*id

Santanu Halder

177 Compiler Design

Handle Pruning A Handle is a substring that matches the body of a production and whose reduction represents one step along the reverse of a rightmost derivation. Example: Right Sentential Form id*id F*id T*id T*F

Santanu Halder

Handle id F id T*F

Reducing Production F→id T→F F→id E→T*F

178 Compiler Design

Shift Reduce Parsing  A stack is used to hold grammar symbols.  Handle always appear on top of the stack.  Initial configuration: Stack Input $ w$  Acceptance configuration: Stack Input $S $

Santanu Halder

179 Compiler Design

Operator Precedence Grammar  An operator precedence grammar is a context-free grammar that has the property that no production has either an empty right-hand side or two adjacent nonterminals in its right-hand side. Or a CFG which is used to define the mathematical operators is called an operator precedence grammar. Example: E→E+E | E*E | id  Rely on the following three precedence relations between terminals: Relation Meaning a b a takes precedence over b Santanu Halder

180 Compiler Design

Operator Precedence Parser  An operator precedence parser is a bottom-up parser that interprets an operator-precedence grammar.  Able to parse an ambiguous grammar.  Simple enough to write by hand.  Can be written to consult an operator table at run time.  Not used often in practice.

Santanu Halder

181 Compiler Design

Operator Precedence Parser..  E→E+E | E*E | id  Terminals are +, *, id, $.  Operator precedence relation table: 1. id has the highest precedence. 2. $ has the lowest precedence. 3. Both + and * are left-associative (hence +·>+ and *·>*) 4. Need not compare between id and id because of two adjacent non-terminals are not present in production rules.

Santanu Halder

182 Compiler Design

Operator Precedence Parser.. E→E+E | E*E | id

id id +

·> ·> 183 Compiler Design

Operator Precedence Parser.. Algorithm for parsing Initialize: Set ip to point to the first symbol of w$. Repeat: If $ is on the top of the stack and ip points to $ then return else Let a be the top terminal on the stack, and b the symbol pointed to by ip; if a b then repeat pop the stack; until (the top stack terminal is related by ·> ·> +

ip

$

+

a id

+

id

*

*

·> ·>

$

·> +

ip

id

+

*

·> ·>

$

·>

$

·> +

ip

id

+

a

*

·> ·>

$

·>

$

·> +

ip id

+

a E

*

·> ·>

$

·>

$

·>

$

·>

$

+

E

ip

+

id

*

·> ·>

$

b) then begin pop 2*|b| symbols off the stack; let s' be the state on top of the stack; push A then goto[s', A] on top of the stack; output the production A->b; end else if (action[s, a] = accept) then return else error(); end

Santanu Halder

212 Compiler Design

LR(0) Parsing Line Stack 1

0

Input aabb$

Action

Action States 0

a

b

s3

s4

1

Santanu Halder

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

213 Compiler Design

LR(0) Parsing Line Stack

Input

Action

Action

1

0

aabb$ Shift to 3

2

0a3

abb$

States 0

a

b

s3

s4

1

Santanu Halder

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

214 Compiler Design

LR(0) Parsing Line Stack

Input

Action

Action

1

0

aabb$ Shift to 3

2

0a3

abb$

3

0a3a3

bb$

Shift to 3

States 0

a

b

s3

s4

1

Santanu Halder

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

215 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

216 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→b means pop 2*|b|=2 symbols from stack which is here 4 and b. Then push A onto the stack and as the previous symbol is 3 in the stack, then perform goto(3,A) which gives 6. Also push 6 onto the stack and don’t forward the input pointer.

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

217 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

218 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Reduce by A→aA

7

0A2

b$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

219 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Reduce by A→aA

7

0A2

b$

Shift to 4

8

0A2b4

$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

220 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Reduce by A→aA

7

0A2

b$

Shift to 4

8

0A2b4

$

Reduce by A→b

9

0A2A5

$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

4

r3

r3

r3

5

r1

r1

r1

6

r2

r2

r2

221 Compiler Design

LR(0) Parsing Line Stack

Input

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Reduce by A→aA

7

0A2

b$

Shift to 4

8

0A2b4

$

9

0A2A5

$

10

0S1

$

Santanu Halder

Action

Action States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

Reduce by A→b

4

r3

r3

r3

Reduce by S→AA

5

r1

r1

r1

6

r2

r2

r2

222 Compiler Design

LR(0) Parsing Line Stack

Input

Action

Action

1

0

aabb$ Shift to 3

2

0a3

abb$

Shift to 3

3

0a3a3

bb$

Shift to 4

4

0a3a3b4

b$

Reduce by A→b

5

0a3a3A6

b$

Reduce by A→aA

6

0a3A6

b$

Reduce by A→aA

7

0A2

b$

Shift to 4

8

0A2b4

$

9

0A2A5

10

0S1

States 0

a

b

s3

s4

1

Goto $

A

S

2

1

acc ept

2

s3

s4

5

3

s3

s4

6

Reduce by A→b

4

r3

r3

r3

$

Reduce by S→AA

5

r1

r1

r1

$

Accept

6

r2

r2

r2

Santanu Halder

223 Compiler Design

SLR(1) Parsing Table 1. Construct C = {I0 , I1 , . . . , In } , the collection of sets of LR(0) items for the augmented grammar G' . 2. State i is constructed from Ii . The parsing actions for state i are determined as follows: a) If [A →α.Aβ] is in Ii and GOTO (Ii, a) = Ij , then set ACTION[i, a] to "shift j“. Here a must be a terminal. b) If [A→α.] is in Ii , then set ACTION[i, a] to "reduce A→α" for all a in FOLLOW(A); here A may not be S' . c) If [S'→S·] is in Ii , then set ACTION[i, $] to "accept“. If any conflicting actions result from the above rules, we say the grammar is not SLR(1) . The algorithm fails to produce a parser in this case. 3. The goto transitions for state i are constructed for all nonterminals A using the rule: If GOTO (Ii, A) = Ij , then GOTo[i, A] = j . 4. All entries not defined by rules (2) and (3) are made "error″. 5. The initial state of the parser is the one constructed from the set of items containing [S'→.S] .

Santanu Halder

224 Compiler Design

SLR(1) Parsing Table.. S'→ S S → AA A → aA A→b

1 2 3

States 0

a s3

Action b s4

1 First Follow S {a,b} {$} A {a,b} {a,b,$}

2

2

s3

s4

5

3

s3

s4

6

4

r3

r3

6

1

accept

5

Santanu Halder

$

Goto A S

r3 r1

r2

r2

r2 225 Compiler Design

SLR(1) Grammar Every SLR(1) grammar is unambiguous, but there are many unambiguous grammars that are not SLR(1). Consider the grammar with productions: S → L=R | R L → *R | id R→L

Santanu Halder

226 Compiler Design

Sets-of-LR(1)-items construction

Santanu Halder

227 Compiler Design

Sets-of-LR(1)-items construction..

Santanu Halder

228 Compiler Design

Sets-of-LR(1)-items construction.. Example: S → AA A → aA | b

Santanu Halder

229 Compiler Design

Sets-of-LR(1)-items construction.. Example: S → AA A → aA | b Add one more production rule: So, now the grammar is: S'→ S S → AA A → aA | b

Santanu Halder

S' → S

230 Compiler Design

Sets-of-LR(1)-items construction.. I5

I1 S

I0

S'→S.,$

I2

S'→.S,$ A S→A.A,$ S→.AA,$ A→.aA,$ A→.aA,a/b A→.b,$ A→.b,a/b a

a b

I3

A→a.A,a/b A→.aA,a/b A→.b,a/b

S→AA.,$

A

I6 A

I9 A→aA.,$

A→a.A,$ a A→.aA,$ A→.b,$

b

a

I7 b A→b.,$

A

I8 A→aA.,a/b

I4 b A→b.,a/b

Santanu Halder

231 Compiler Design

Sets-of-LR(1)-items construction.. The I9 state S'→S.,$ S→AA.,$ S I0 A I2 I6 A A→aA.,$ pairs (I4, I7), (I8, S'→.S,$ S→A.A,$ A→a.A,$ A a I9) and a A→.aA,$ S→.AA,$ A→.aA,$ (I3,I6) A→.aA,a/b A→.b,$ A→.b,$ have the A→.b,a/b b I7 b a I3 same A→b.,$ a A→a.A,a/b LR(0) I A A→.aA,a/b b 8 with A→.b,a/b A→aA.,a/b different I4 b Lookahead A→b.,a/b I1

Santanu Halder

I5

232 Compiler Design

Canonical LR(1) Parsing Table

Santanu Halder

233 Compiler Design

Canonical LR(1) Parsing Table.. States 0 1 2 3 4 5 6 7 8 9

a s3

Action b s4

Goto $

A 2

S 1

accept s6 s3 r3

s7 s4 r3

5 8 r1

s6

s7

9 r3

r2

Santanu Halder

r2 r2 234 Compiler Design

Canonical LR(1) Parsing Table.. States 0 1 2 3 4 5 6 7 8 9

a s3

Action b s4

Goto $

A 2

accept s6 s3 r3

s7 s4 r3

5 8 r1

s6

s7

9 r3

r2

Santanu Halder

r2 r2

S 1

As the state pairs (I4, I7), (I8, I9) and (I3, I6) have the same LR(0), merge the rows in the parse table and get the new parse table as LALR(1) parsing table 235 Compiler Design

LALR(1) Parsing Table States 0 1 2 36 47 5 89

a s36

Action b s47

Goto $

A 2

accept s36 s36 r3

s47 s47 r3

r2

r2

Santanu Halder

5 89

S 1

Renaming State 3 and 6 as (36), State 4 and 7 as (47), State 8 and 9 as (89).

r3 r1 r2

236 Compiler Design

End of Unit III

Santanu Halder

237 Compiler Design