Daniela da Cruz Pedro Rangel Henriques Universidade do Minho Braga - Portugal
set.2005 gEPL / DI-UM 1
Daniela da Cruz
Pedro Rangel Henriques
Universidade do Minho
Braga - Portugal
Project Context
This project emerged in the context of another project: CoCo/RF
• Partners: Universidade da Beira Interior (UBI), University of Linz, and Universidade do Minho (UM)
• Aim: to port the compiler generator CoCo/R (developed at ULinz) to OCaml (more precisely, to F#)
Project Aims
• To explore as fully as possible one of the present implementations of CoCo/R (we chose the C# version), in order to understand the generator, the generated compiler, and the CoCoL specifications.
• To develop a complete compiler for an imperative and structured programming language.
Compiler
Translates a Source Program, written in LISS, into Assembly code of the Target Machine
• First Version (simple LISS spec): MSP (very simple virtual stack machine)
• Second Version (full LISS spec): VM (powerful virtual stack machine)
Compiler
• Top-Down parser: pure Recursive-Descent, solving LL(1) conflicts with a lookahead(n) strategy
• Syntax-Directed Translator (static / dynamic semantic rules executed during parsing), supporting Inherited and Synthesized Attributes
• Implemented in C#
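As a rough illustration of recursive-descent parsing with translation done during the parse, here is a minimal Python sketch (the real compiler is generated by CoCo/R in C#). The expression grammar, the token format and the instruction names (PUSHI, ADD, SUB, MUL, DIV) are assumptions made for this example, not the LISS grammar or the actual VM opcodes.

```python
import re

def tokenize(text):
    """Split an arithmetic expression into number and operator tokens."""
    return re.findall(r"\d+|[-+*/()]", text)

class Parser:
    """One method per nonterminal; code is emitted while parsing."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.code = []  # generated stack-machine instructions (hypothetical names)

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.peek()!r}")
        self.pos += 1

    # Expr = Term { ("+" | "-") Term }
    def expr(self):
        self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            self.term()
            self.code.append("ADD" if op == "+" else "SUB")

    # Term = Factor { ("*" | "/") Factor }
    def term(self):
        self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            self.factor()
            self.code.append("MUL" if op == "*" else "DIV")

    # Factor = number | "(" Expr ")"
    def factor(self):
        tok = self.peek()
        if tok is not None and tok.isdigit():
            self.pos += 1
            self.code.append(f"PUSHI {tok}")
        elif tok == "(":
            self.pos += 1
            self.expr()
            self.expect(")")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

def compile_expr(text):
    """Parse text and return the list of emitted instructions."""
    parser = Parser(tokenize(text))
    parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at {parser.peek()!r}")
    return parser.code
```

Calling compile_expr("6 + 8 * 5") emits the operands in postfix order, so a stack machine can execute the result directly.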
Compiler
• Generated by the compiler compiler CoCo/R (C# implementation)
• LISS syntax and semantics were specified by an AG (attribute grammar) written in CoCoL
• Input files: Liss.ATG + SymbolTable.cs + VMcodegen.cs
• Output file: Liss.exe
The Programming Language LISS
LISS stands for
Language of Integers, Sequences and Sets
LISS is a high-level toy language, suited to teaching basic skills in imperative (procedural) structured programming.
The language follows the verbose Pascal style, with long keywords (case-insensitive)
The Programming Language LISS
LISS was designed with the main goal of being a challenging case study for compiler courses.
With a proper syntax (simplifying the syntactic analysis), LISS has an unusual semantic definition!
It requires a powerful semantic analysis and a clever machine-code generation.
The Programming Language LISS
The design of LISS emphasizes:
• Static (compile-time) and dynamic (run-time) type checking
• Scope analysis
• Code generation strategies (translation schemas) to support non-standard data types, I/O and control statements
Error Detection
program Errors
{
Declarations
a := 4, b -> Integer;
d := true, flag -> boolean;
array1 := [[1,2],[2,3]], vector -> Array size 4,3;
seq1 :=<<1,2,3,4>> -> Sequence;
Statements
// error: "8/d" is not allowed, because "d" is of type boolean
b = 6 + 8 * 5 - 8/d;
// error: "array1" is a two-dimensional array, so a single index is not enough
a = array1[1];
// the result of indexing array1 cannot be given a boolean type
b = array1[0,0];
// must flag an error because the final tail will be empty
a = head(tail(tail(tail(tail(seq1)))));
}
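The compile-time part of the checks above can be sketched as a small type checker over expression trees. This is only an illustration in Python: the tuple-based tree shape, the type names and the function name type_of are assumptions, not the compiler's actual data structures.

```python
def type_of(expr, symbols):
    """Return the type of expr, raising TypeError on a type mismatch.

    expr is a literal, a variable name (str), or a tuple (op, left, right).
    symbols maps variable names to type names (hypothetical representation).
    """
    if isinstance(expr, bool):      # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(expr, int):
        return "integer"
    if isinstance(expr, str):       # variable reference
        if expr not in symbols:
            raise NameError(f"undeclared variable {expr!r}")
        return symbols[expr]
    op, left, right = expr
    left_type = type_of(left, symbols)
    right_type = type_of(right, symbols)
    if op in ("+", "-", "*", "/"):
        if left_type != "integer" or right_type != "integer":
            raise TypeError(
                f"operator {op!r} needs integers, got {left_type} and {right_type}")
        return "integer"
    raise ValueError(f"unknown operator {op!r}")
```

With d declared as boolean, the subexpression ("/", 8, "d") is rejected before any code is generated, which mirrors the first error in the program above.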
Variable Declarations
Variables of any type can be declared.
All variables are initialized with default values (0, empty, false).
Initial values can be assigned in declarations:
i, j, count=100 -> INT;
vec=[[[1,2,3],[4,5],[6]]] -> ARRAY size 4,3,6
lst1, lst2=<<9,8,7,6>>, lst=<<11>> -> SEQ;
Codes, C3, C2, C1={y|y>250} -> SET;
flag=True, exists, found=True -> Bool
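A minimal sketch of how a symbol table might apply these defaults, written in Python for illustration. The table layout, the function name declare and the use of a Python list for an empty sequence are assumptions; sets and arrays are left out.

```python
import copy

# Default values per type, following the slide: 0, empty, false.
DEFAULTS = {"INT": 0, "SEQ": [], "BOOL": False}

def declare(table, items, type_name):
    """Add each (name, initial) pair to table; None means 'use the default'."""
    for name, initial in items:
        if name in table:
            raise NameError(f"{name!r} is already declared")
        # Copy the default so two sequences never share one list object.
        value = copy.deepcopy(DEFAULTS[type_name]) if initial is None else initial
        table[name] = (type_name, value)
    return table
```

For example, declare({}, [("i", None), ("j", None), ("count", 100)], "INT") records i and j with value 0 and count with value 100.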
Data Types: Integers
• Algebraic (+ - * /)
• Successor (suc) & Predecessor (pred)
• Relational Operators ( = != < <= > >= )
• I/O:
– Read
read( i )
– Write
write( i ); writeLn( a*b/2 )
Data Types: Integers
program IntegerTest {
Declarations
intA := 4, intB, intC := 6 -> Integer;
i, j, k -> Integer;
Statements
// arithmetic operations
intA = -3 + intB * (7 + intc);
writeLn(intA);
// input
read(i); read(j);
writeLn( i/j );
/* Inc/Dec operations */
writeLn( pred(INTc) ); writeLn( suc(INTc) );
}
Data Types: Integers
Constraints
- Division by zero is not defined
Data Types: Static Sequences (Multi-Dimensional Arrays)
• Indexing
vec[i] = vec[2] + vec[j-4]
vec3[i,j,k] = i*j*k
• Assignment
vec2 = vec1
• Length
length( vec2 )
• I/O:
– Write
write( vec )
Data Types: Static Sequences
Constraints
- Every index must lie within the declared range
- The number of indexes must match the number of dimensions
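Both constraints can be checked while computing the position of an element in a flattened array. The Python sketch below assumes zero-based indexes and row-major layout; neither is confirmed by the slides, and flat_offset is a hypothetical helper name.

```python
def flat_offset(dims, indexes):
    """Row-major offset of indexes in an array with the given dimension sizes.

    Raises IndexError if the number of indexes differs from the number of
    dimensions, or if any index falls outside 0..size-1 (assumed range).
    """
    if len(indexes) != len(dims):
        raise IndexError(f"expected {len(dims)} indexes, got {len(indexes)}")
    offset = 0
    for size, idx in zip(dims, indexes):
        if not 0 <= idx < size:
            raise IndexError(f"index {idx} out of range 0..{size - 1}")
        offset = offset * size + idx
    return offset
```

When the indexes are constants, a check like this can run at compile time; when they are expressions such as a*2, the same test has to be emitted as run-time code.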
Data Types: Static Sequences
program ArrayTest
{
Declarations
vector1 := [1,2,3], vector2 -> Array size 5;
array1 := [[1,2],[4]] -> Array size 4,2;
array2 := [
[[1],[5]],
[[2,2],[3]]
] -> Array size 4,3,2;
Statements
a = array2[1,2,3];
b = array2[1,0,a*2];
array2[2,b,a] = 15;
writeLn(array2);
vector2 = vector1;
}
Data Types: Dynamic Sequences
• Linked Lists
– Empty List
– List with some elements
Data Types: Dynamic Sequences
Operations:
• Insert & Delete
cons( 2,lst ); del( 2,lst )
• Head & Tail
i = head(lst); list = tail(lst);
• Member & IsEmpty
if ( isMember(2,lst) )…;
while ( isEmpty(lst) )…;
Data Types: Dynamic Sequences (cont.)
• Indexing
lst[3]; names[2*i-j]
• Length
length( lst );
• Assignment
lst2 = lst1; copy( lstSrc , lstDest );
• I/O:
– Write
write( lst )
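The sequence operations listed above map naturally onto a singly linked list. The Python sketch below is one possible reading: it assumes that cons inserts at the front, that del removes the first matching element, and that head or tail of an empty sequence is a run-time error. The class and method names are hypothetical.

```python
class Cell:
    """One node of the linked list."""
    def __init__(self, value, next_cell=None):
        self.value = value
        self.next = next_cell

class Seq:
    def __init__(self, values=()):
        self.first = None
        for value in reversed(list(values)):
            self.first = Cell(value, self.first)

    def to_list(self):
        out, cur = [], self.first
        while cur is not None:
            out.append(cur.value)
            cur = cur.next
        return out

    def is_empty(self):
        return self.first is None

    def cons(self, value):
        """Insert value at the front (assumed position)."""
        self.first = Cell(value, self.first)

    def delete(self, value):
        """Remove the first cell holding value; return True if one was found."""
        prev, cur = None, self.first
        while cur is not None:
            if cur.value == value:
                if prev is None:
                    self.first = cur.next
                else:
                    prev.next = cur.next
                return True
            prev, cur = cur, cur.next
        return False

    def head(self):
        if self.first is None:
            raise IndexError("head of an empty sequence")
        return self.first.value

    def tail(self):
        """Return a copy of the sequence without its first element."""
        if self.first is None:
            raise IndexError("tail of an empty sequence")
        return Seq(self.to_list()[1:])

    def is_member(self, value):
        return value in self.to_list()

    def length(self):
        return len(self.to_list())
```

Because emptiness is only known at run time, a call such as head(tail(tail(tail(tail(seq1))))) on a four-element sequence has to be caught by a dynamic check like the one in head.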
Data Types: Dynamic Sequences
program Seq
{ Declarations
seq1 :=<<10,20,30,40,50>>, seq3 := <<1,2>>, seq2 -> Sequence;
Statements
// Selection Operations
a = head(tail(tail(seq1)));
seq2 = TAIL(seq3); seq1 = tail(seq1);
// add & delete an element of a sequence
cons(3*4+a,seq2); cons(a*int1*int2,seq2);
del(30,seq1);
// tests (empty & void)
b = isEmpty(seq2); writeLn(b);
if ( member(1,seq3) ) then writeLn("Is member list");
// indexing a sequence
int1 = seq1[2*head(tail(seq3))];
// write the sequence
seq3 = seq1; write(seq3);
}
Data Types: Dynamic Sequences
Constraints
-
Data Types: Sets
• Sets are:
1. Defined by comprehension
Codes = { x | x>=100 && x<500 }
2. Represented (in memory) as a binary tree
Data Types: Sets
• Union (++) & Intersection (**)
C3 = C1++C2; C3 = C1**C2
• Member (in)
while ( N in Codes )
• I/O:
– Write
write( C3 )
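One way to read "defined by comprehension, represented as a binary tree" is that a set is stored as the expression tree of its defining predicate. That reading is an assumption, and the Python sketch below only illustrates it: union and intersection build new tree nodes, and membership evaluates the tree. The node shapes and function names are hypothetical.

```python
import operator

COMPARE = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
           ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def cmp_node(op, bound):
    """Leaf: the set { x | x op bound }."""
    if op not in COMPARE:
        raise ValueError(f"unknown comparison {op!r}")
    return ("cmp", op, bound)

def union(a, b):
    """Union (++): x belongs if it satisfies either predicate."""
    return ("or", a, b)

def intersection(a, b):
    """Intersection (**): x belongs if it satisfies both predicates."""
    return ("and", a, b)

def contains(node, x):
    """Membership test: evaluate the predicate tree for the value x."""
    kind = node[0]
    if kind == "cmp":
        _, op, bound = node
        return COMPARE[op](x, bound)
    if kind == "or":
        return contains(node[1], x) or contains(node[2], x)
    if kind == "and":
        return contains(node[1], x) and contains(node[2], x)
    raise ValueError(f"unknown node kind {kind!r}")
```

Under this representation a set such as { x | x>=100 && x<500 } is never enumerated: it is the tree intersection(cmp_node(">=", 100), cmp_node("<", 500)), and only membership queries walk it.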
Data Types: Sets
program Sets {
Declarations
bool := true, flag, flag2 -> BOOLEAN;
e, f := { x | x > 7}, g := {x | x < 8 || x > 15 && x < 13 } -> Set;
Statements
-- sets
e = f ++ g;
f = g ** f;
g = g ** { x | x > 6 };
flag = a << e;
flag2 = a << g;
bool = 8 << {x | x < 10 && x > 7 };
}
Data Types: Booleans
• Boolean Operators
1. &&
2. ||
3. not
Data Types: Booleans
program BoolTest {
Declarations
intA := 4, intB, intC := 6 -> Integer;
bool, flag := false -> booLEaN;
Statements
bool = intA < 8;
writeLn(bool);
/* logic operations */
flag = (intB != intA) && (intA > 7) || bool;
writeLn(flag);
bool = !( (intA == intB)||(intA != intC)&&(intC < 6) ) || flag;
writeLn(bool)
}
SubPrograms
Subprograms, with zero or more parameters,
can be
• Functions (return a value)
• Procedures (don’t return a value)
can be declared
• at the same level
• nested (to any depth)
can be called anywhere they are visible.
SubPrograms
subProgram calculate() :: integer
{
Declarations
res := 6 -> integer;
index -> INTeger;
subprogram factorial(n -> integer) :: integer
{
Declarations
res := 1 -> integer;
Statements
while (n > 0)
{
res = res * n;
n = n -1;
}
return res;
}
for ( index in 0..4 )
{
res = factorial(a);
writeLn(res);
}
}
SubPrograms
• At the top level, given the previous example, we can:
intA = calculate();
• But we can’t (factorial is nested inside calculate, so it is not visible at the top level):
intA = factorial(6);
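This visibility rule is what a chain of nested scopes gives during scope analysis. The Python sketch below is an assumption about how such a check could look, not the compiler's actual SymbolTable: each scope knows its parent, and a lookup walks outwards only.

```python
class Scope:
    """A block's names plus a link to the enclosing scope (hypothetical design)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, name, info):
        if name in self.names:
            raise NameError(f"{name!r} is already declared in this scope")
        self.names[name] = info

    def lookup(self, name):
        """Search this scope, then each enclosing scope; never inner scopes."""
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not visible here")
```

If calculate is declared in the top-level scope and factorial in the scope of calculate's body, then a lookup of factorial from the top level fails, while the body of calculate can see both names.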
SubPrograms
subprogram factorial(n -> integer) :: integer
{
Declarations
res := 1 -> integer;
Statements
while (n > 0)
{ res = res * n; pred(n); }
return res;
}
subProgram calculate(m -> array size 4) :: array size 4
{
Declarations
fac := 6 -> integer;
res := -16 -> integer;
Statements
for (a in 0..3) stepUP 1
{ m[a] = factorial(fac + a); }
return m;
}
Control Statements
• If () Then {} [ Else {} ]
1. if d == true then { if( flag ) tHen { a = 6;} else { b = 9; } }
2. if !(c[2]==0) then
{ a = 5;
write(a);
}
else {
b=7;
c[5] = 5;
write(b);
}
Control Statements
• While () {}
1. while(isMember(10,seq2)){
delete(10,seq2); writeLn(seq2);
}
2. while(length(array) != 10){
array[i] = i; suc(i);
}
Control Statements
For
For i in v1 [..v2 [stepup/stepdown N]]
[satisfying Exp]
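The range form of this statement can be modelled as a generator. The Python sketch below covers only "v1..v2" with an optional step and filter, and it assumes inclusive bounds, a default step of 1, and that stepdown counts from v1 down to v2. These semantics are guesses from the examples, and for_values is a hypothetical name.

```python
def for_values(v1, v2, step=1, down=False, satisfying=None):
    """Yield the loop values of 'for i in v1..v2 [stepup/stepdown step] [satisfying p]'.

    Bounds are treated as inclusive (assumption); values failing the
    'satisfying' predicate are skipped but still advance the loop.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    i = v1
    while (i >= v2) if down else (i <= v2):
        if satisfying is None or satisfying(i):
            yield i
        i += -step if down else step
```

For instance, list(for_values(0, 9, satisfying=lambda i: i % 2 == 0)) visits only the even values of the range.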
VM Architecture
• Virtual Machine with:
1. Instruction stack (program)
2. Calling stack: saves pointer pairs (i,f):
• i: saves pc
• f: saves fp
3. Execution stack (global/local/working memory)
4. Two Heaps
5. Four registers (pc, sp, fp, gp)
VM - Instruction Set
• Data Transfer
– Push / Load
– Pop
– Store
– Alloc / Free
• IO
– Read
– Write
VM - Instruction Set
• Control
– Jump
– Call
– Return
• Miscellaneous
– Type Conversion
– Check
– Start
– Stop
– Err
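To make the architecture concrete, here is a toy interpreter in Python with an instruction list, an execution stack, a calling stack of (pc, fp) pairs, and pc and fp registers. The opcode names (PUSHI, ADD, MUL, DIV, JUMP, JZ, CALL, RETURN, WRITE, STOP) and their exact behaviour are assumptions for illustration; the real VM instruction set is only named by category in the slides.

```python
def run(program):
    """Execute a list of (opcode, *args) tuples and return the written values."""
    stack = []    # execution stack; sp is implicitly len(stack)
    calls = []    # calling stack of (saved pc, saved fp) pairs
    output = []   # values produced by WRITE
    pc, fp = 0, 0
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "PUSHI":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "DIV":
            b, a = stack.pop(), stack.pop()
            if b == 0:  # run-time check: division by zero is not defined
                raise ZeroDivisionError("division by zero is not defined")
            stack.append(a // b)
        elif op == "JUMP":
            pc = args[0]
        elif op == "JZ":    # jump if the popped value is zero
            if stack.pop() == 0:
                pc = args[0]
        elif op == "CALL":
            calls.append((pc, fp))  # save return address and frame pointer
            fp = len(stack)
            pc = args[0]
        elif op == "RETURN":
            pc, fp = calls.pop()    # restore the saved pair
        elif op == "WRITE":
            output.append(stack.pop())
        elif op == "STOP":
            return output
        else:
            raise ValueError(f"unknown instruction {op!r}")
```

A short program that pushes 6 and 8, calls a routine at address 5 that multiplies them, and then writes the result exercises the saved (pc, fp) pair; in this sketch fp is saved and restored but not yet used to address local variables.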
Project Documentation
• Technical Report on LISS Compiler Development, written in NoWeb, includes
– LISS Specification (AG in CoCoL)
– Sample LISS Programs (Tests)
– Compiler’s Internal Data Structures
– Target Machine Description (VM)
– Translation Schemas