CD Chapter 1
• specification of tokens
• recognition of tokens
• hand-written lexical analyzers
• LEX, examples of LEX programs.
Compiler
• A compiler acts as a translator, transforming human-oriented programming languages into computer-oriented machine languages.
• It hides machine-dependent details from the programmer.
The structure of a compiler
Any compiler must perform two major tasks
• RE ( Regular expression )
• NFA ( Non-deterministic Finite Automata )
• DFA ( Deterministic Finite Automata )
• LEX
2- Parser
Given a formal syntax specification (typically a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
As syntactic structure is recognized, the parser either calls the corresponding semantic routines directly or builds a syntax tree.
• CFG ( Context-Free Grammar )
• BNF ( Backus-Naur Form )
• GAA ( Grammar Analysis Algorithms )
• LL, LR, SLR, LALR Parsers
• YACC
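The grouping of tokens into units by grammar productions can be sketched with a small recursive-descent parser. This is only an illustration under stated assumptions: the three-production expression grammar, the (kind, value) token tuples, and the names Parser and parse are hypothetical and do not come from the slides. Each method corresponds to one production, and the parser builds a syntax tree as nested tuples.

```python
# Hypothetical grammar, chosen only for illustration:
#   expr   -> term ('+' term)*
#   term   -> factor ('*' factor)*
#   factor -> NUMBER | ID | '(' expr ')'
# Tokens are assumed to be (kind, value) tuples produced by a scanner.


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Kind of the next token, or None at end of input.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def expect(self, kind):
        # Consume the next token if it has the expected kind.
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.expect("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):
        if self.peek() == "(":
            self.expect("(")
            node = self.expr()
            self.expect(")")
            return node
        if self.peek() == "NUMBER":
            return ("num", self.expect("NUMBER")[1])
        return ("id", self.expect("ID")[1])


def parse(tokens):
    # Parse a whole token list and reject trailing tokens.
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()}")
    return tree
```

For the token list of a + b * 2, parse returns a tree in which the multiplication is nested under the addition, because term is called from expr. Tools such as YACC generate table-driven parsers rather than hand-written code of this kind.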
3- Semantic Routines
Perform two functions:
• Check the static semantics of each construct
• Do the actual translation
Semantic routines are the heart of a compiler.
A pattern is a description of the form that the lexemes of a token may take.
A lexeme is a sequence of characters in the source program that matches the pattern for a token.
Lexical analysis - The role of a lexical analyzer
Example
E = M * C ** 2
<id, pointer to symbol table entry for E>
<assign-op>
<id, pointer to symbol table entry for M>
<mult-op>
<id, pointer to symbol table entry for C>
<exp-op>
<number, integer value 2>
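A minimal sketch of how a scanner might produce the token stream above is given here. It assumes a hypothetical token set (id, number, assign-op, mult-op, exp-op) matched with Python regular expressions, and it uses an index into a plain list as a stand-in for the "pointer to symbol table entry". The names TOKEN_PATTERNS and tokenize are illustrative only.

```python
import re

# Patterns are tried in order, so "**" is listed before "*".
# The token names mirror the example; the patterns are assumptions.
TOKEN_PATTERNS = [
    ("skip",      re.compile(r"\s+")),
    ("number",    re.compile(r"\d+")),
    ("id",        re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("exp-op",    re.compile(r"\*\*")),
    ("mult-op",   re.compile(r"\*")),
    ("assign-op", re.compile(r"=")),
]


def tokenize(source):
    """Return (tokens, symbol_table) for the given source string."""
    symbol_table = []   # list index stands in for a symbol table pointer
    tokens = []
    pos = 0
    while pos < len(source):
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            if match:
                break
        else:
            raise ValueError(f"illegal character {source[pos]!r} at position {pos}")
        pos = match.end()
        lexeme = match.group()
        if name == "skip":
            continue            # whitespace produces no token
        if name == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append((name, symbol_table.index(lexeme)))
        elif name == "number":
            tokens.append((name, int(lexeme)))
        else:
            tokens.append((name, None))   # operators carry no attribute
    return tokens, symbol_table
```

Calling tokenize("E = M * C ** 2") yields seven tokens in the same order as the list above, with E, M and C entered into the symbol table in that order.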
Lexical errors:
A character sequence that cannot be scanned into any valid token is a lexical error. Important facts about lexical errors:
• Misspellings of identifiers, operators, and keywords are considered lexical errors.
• Generally, a lexical error is caused by the appearance of some illegal character, mostly at the beginning of a token.
e.g.: int @a1;
The statement above has a lexical error because an identifier (@a1) cannot start with a special character.
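The example can be checked with a small sketch that reports every character at which no valid token can start. The set of valid lexemes (whitespace, identifiers, integers, a few punctuation characters) and the name find_lexical_errors are assumptions made for illustration. On an error the sketch skips the offending character and continues scanning, which is one simple recovery strategy among several.

```python
import re

# Assumed set of valid lexemes: whitespace, identifiers, integers,
# and a handful of single-character operators and punctuation marks.
VALID = re.compile(r"\s+|[A-Za-z_][A-Za-z0-9_]*|\d+|[;=+\-*/()]")


def find_lexical_errors(source):
    """Return a list of (position, character) pairs for illegal characters."""
    errors = []
    pos = 0
    while pos < len(source):
        match = VALID.match(source, pos)
        if match:
            pos = match.end()
        else:
            errors.append((pos, source[pos]))
            pos += 1    # skip the offending character and keep scanning
    return errors
```

For the statement int @a1; the only reported error is the character @ at position 4, after which a1 is scanned as an ordinary identifier.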
Lexical analysis - The role of a lexical analyzer
• In the theory of compilation, regular expressions are used to formalize the specification of tokens.
• Regular expressions are a means of specifying regular languages.
Example:
letter_(letter_ | digit)*
• Each regular expression is a pattern specifying the form of strings.
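The identifier pattern above can be tried out with Python's re module. This sketch assumes that letter_ stands for any ASCII letter or the underscore and that digit stands for 0 to 9. The name is_identifier is hypothetical.

```python
import re

# letter_ (letter_ | digit)*  written as a Python regular expression,
# assuming letter_ = [A-Za-z_] and digit = [0-9].
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """True if the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None
```

With this pattern, a1 and _tmp are accepted, while 1a and @a1 are rejected because their first character is not a letter or an underscore.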
Recognition of tokens
%%
... rules section...
%%
... user subroutines ...
www.paruluniversity.ac.in