Expand description
§LIR (Low Intermediate Representation) Module
This module contains a decently sized typechecked intermediate representation for the virtual machine. The LIR, unlike the VM and the assembly language, is not split into two variants: there is only one variant. The LIR compiler will generate core assembly code by default for the LIR, but will revert to the standard variant if unsupported instructions or types are encountered (such as floating point operations or float types).
§Index
§Purpose
The purpose of the LIR is to provide a powerful backend for the next stage of IR. Most of the heavy lifting of the actual compilation process is done by this stage of the compiler: typechecking, constant folding, compiling to assembly, dealing with the two variants of the virtual machine, and so on. The stages of IR above this simply implement features like macros and tagged-unions: very simple things which map 1:1 with generated LIR code.
§Features
- The Type System
LIR’s type system is very good for compiling directly to assembly. LIR supports the following types:
- None (the void type)
- Never (the type of an expression which never returns a value: such as a
return
expression) - Int (a signed integer)
- Float (a floating point number)
- Bool (a boolean value)
- Char (a single character)
- Cell (the most basic unit of memory)
- Pointer (a pointer to another given type)
- Array (an array with elements, with a constant size expression)
- Proc (a procedure with a list of arguments and a return type)
- Tuple (a tuple with a list of distinct types. this is the product type)
- Struct (a tuple with named fields)
- Union (a union of types. this is the sum type when combined with a tag)
- Enum (an enumeration with a list of variants. this is like a C enum, not a Rust enum)
- Let (a type which allows the user to bind a type under a given name in another type expression)
The Let type is extremely powerful, allowing users to create recursive types inline: without
binding them to a name under a LetType expression. Additionally, Let types are checked for equality
structurally, and this also works for comparing recursive types. There are many examples of this in tests/lir.rs
.
- The Constant Folding
LIR also provides constant expressions to allow the user to do as much as possible during compile time. This also makes it simpler to do compile-time optimizations.
- Expressions
The expressions that LIR uses to represent the program are very simple, and very powerful. Arrays are kept distinct from Pointers (unlike in C), and so expressions can return stack allocated arrays without a problem. Arrays can also be indexed without a pointer to the array, and so on. LIR supports getting members of tuples, structs, and unions, and also getting their references as well.
- Compilation Process
LIR is designed to be able to compile as much as possible to the core variant of the assembly language. As long as you don’t
use floating point operations or standard builtins (alloc
and free
), you can compile to the core variant. Recursive types,
inlined recursive types, mutually recursive types, recursive functions, and core builtins are all supported without a problem.
The LIR compiler will only use a standard instruction if it has to.
Structs§
- A boolean “And” operation between two values.
- An assignment operation. This is used to implement assignment operators like
+=
. - A boolean “BitwiseAnd” operation between two values.
- A boolean “BitwiseNand” operation between two values.
- A boolean “BitwiseNor” operation between two values.
- A boolean “BitwiseOr” operation between two values.
- A boolean “BitwiseXor” operation between two values.
- A builtin pseudo-procedure implemented in the core assembly variant.
- Get the Union data associated with a tagged union (EnumUnion).
- An environment under which expressions and types are compiled and typechecked. This is essentially the scope of an expression.
- A typed procedure which calls a foreign function. This is compiled down to a standard assembly
Call
instruction. The label is the name of the foreign function. The types determine the size of the cells for the arguments and return value. - A boolean “Not” operation on a value.
- A boolean “Or” operation between two values.
- A polymorphic procedure of LIR code which can be applied to a list of arguments with type arguments. This is mono-morphed into a
Procedure
when it is called with a list of type arguments. A procedure is compiled down to a label in the assembly code. - A monomorphic procedure of LIR code which can be applied to a list of arguments. A procedure is compiled down to a label in the assembly code. The label is called with the
Call
instruction. - A builtin pseudo-procedure implemented in the standard assembly variant.
- Get the Enum value of the tag associated with a tagged union (EnumUnion).
Enums§
- An annotation for metadata about an LIR expression. This is used for error reporting, debugging, optimization, and for representing the LIR in a human-readable format.
- An arithmetic operation.
- A comparison operation between two values.
- A compiletime expression.
- A declaration of a variable, function, type, etc.
- An LIR compilation error.
- TODO: Add variants for
LetProc
,LetVar
, etc. to support multiple definitions. This way, we don’t overflow the stack with several clones of the environment. A runtime expression. - Mutability of a pointer. This is used to provide type safety for pointers. A mutable pointer can be used to mutate the value it points to. An immutable pointer can only be used to read the value it points to. An
Any
pointer can be used to read or write the value it points to; this is used to override pointer access protections for compiler-builtins. - A pattern which can be matched against an expression.
- Print a value to a given output.
- The representation of a type in the LIR type system.
Traits§
- A trait used to implemented an assignment operation.
- A trait used to implement a binary operation.
- A trait which allows an LIR expression to be compiled to one of the two variants of the assembly language.
- Get the size of something in memory (number of cells).
- Get the type associated with a value under a given environment.
- Simplify an expression while maintaining structural equality.
- A trait used to implement a ternary operation.
- A trait used to enforce type checking.
- A trait used to implement a unary operation.