Introduction to compiling
Architecture and context of a compiler. Analysis of the source program: lexical analysis, syntax analysis, semantic analysis. Concrete and abstract syntax trees. Compiler phases. Symbol table. Transformation of the program internal representation. Bootstrapping and porting of a compiler.
Lexical analysis
Architecture and role of a lexical analyzer. Symbols, lexical strings, and patterns. Key words and reserved words. Lexical attributes. Deterministic and nondeterministic finite automata. Transformation of regular expressions into nondeterministic automata. Transformation of nondeterministic automata into deterministic automata. Lex, a generator of lexical analyzers.
Syntax methods
Architecture and role of a syntax analyzer. BNF notation. Derivations and syntax trees. Ambiguous grammars. Direct and indirect left recursion. Removal of left recursion. Left factorization. Relation between grammars and regular expressions. Table of operators and grammar of expressions.
Top-down parsing
Classification. Recursive-descent parsing. LL(1) parsing. Construction of the parsing table. Top-down generation of the abstract syntax tree.
Bottom-up parsing
Classification. LR(0) parsing. SLR(1) parsing. Rules for ambiguity resolution in parsing conflicts. LR(1) parsing. LALR(1) parsing. Yacc, a generator of bottom-up parsers. Bottom-up generation of the abstract syntax tree.
Semantic analysis
Architecture and role of a semantic analyzer. Formalisms for static semantic specification. Attribute grammars. Syntax-directed semantics. Algorithms for the computation of semantic attributes. Dependency graphs and order of evaluation. Circular and noncircular attribute grammars. Synthesized and inherited attributes. Mixed attributes. Attributes stored within activation records of functions. Type checking: structural equivalence and equivalence based on names.
Intermediate code generation
Intermediate representation. Three-address code and P-code. Intermediate code as a synthesized attribute. Practical code-generation. Code generation for references to data structures, array manipulation, records, pointers, control statements, logical expressions (possibly evaluated in short circuit), and subprograms.
Runtime environments
Memory organization. Activation record. Calling sequence. Fully static environments and stack-based environments.