Compiler development is a difficult task. Fortunately, with the development of projects like LLVM, this problem is greatly simplified, which allows even a lone programmer to create a new language with performance close to C. Working with LLVM is complicated by the fact that the system consists of a huge amount of code with relatively little documentation. To try to remedy this shortcoming, the author of the material, the translation of which we are publishing today, is going to demonstrate examples of Go code and show how they are first translated into Go SSA and then into LLVM IR.
First example
The first function I'm going to look at here is a simple mechanism for adding numbers:
func myAdd(a, b int) int{
return a + b
}
This function is about as simple as a function can get. It translates to the following Go SSA code:
func myAdd(a int, b int) int:
entry:
t0 = a + b int
return t0
In this representation of the function, hints about data types are placed on the right; in most cases they can be ignored.
This small example already shows one essential aspect of SSA: when code is converted to SSA form, each expression is broken down into the most elementary parts it consists of. In our case, the statement return a + b is actually two operations: adding two numbers and returning the result.
In addition, here you can see the basic blocks of the program; in this code there is only one block, the entry block. We will talk more about blocks below.
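To make this decomposition concrete, here is a hand-written Go analogue of the SSA form above. This is purely an illustration (the name myAddSSA is invented here, not compiler output): every subexpression is computed into its own single-assignment temporary.

```go
package main

import "fmt"

// myAddSSA mimics the SSA form of myAdd by hand: each subexpression
// is assigned exactly once to its own temporary.
func myAddSSA(a, b int) int {
	t0 := a + b // t0 = a + b
	return t0   // return t0
}

func main() {
	fmt.Println(myAddSSA(2, 3)) // 5
}
```

For a function this small the rewrite looks trivial, but the single-assignment discipline is what the later examples build on.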
The Go SSA code is easily converted to LLVM IR:
define i64 @myAdd(i64 %a, i64 %b) {
entry:
%0 = add i64 %a, %b
ret i64 %0
}
You can see that although the syntax here is different, the structure of the function is basically unchanged. LLVM IR is somewhat more strongly typed than Go SSA code and is closer to C in that respect. In the function declaration, the return type comes first, and the argument type is written before the argument name. In addition, to simplify IR parsing, names of global entities are prefixed with the symbol @ and local names with the symbol % (a function is also considered a global entity).
One thing to note about this code is that the decision on how to represent Go's int type, which can be a 32-bit or a 64-bit value depending on the compiler and the compilation target, is made when the LLVM IR is generated. This is one of the many reasons why LLVM IR is not, as many people think, platform-independent: code generated for one platform cannot simply be taken and compiled for another (unless great care is taken to avoid platform-specific details).
Another interesting point worth noting is that the type i64 is not a signed integer: it is neutral with respect to sign. Depending on the instruction, it can represent either signed or unsigned numbers. For the addition operation this makes no difference, so there is no distinction between working with signed and unsigned numbers here. It is worth noting that in C, overflow of a signed integer variable is undefined behavior, so the Clang frontend adds the nsw flag (no signed wrap) to the add operation, which tells LLVM that it can assume the addition never overflows.
This may be important for some optimizations. For example, adding two i16 values on a 32-bit platform (with 32-bit registers) needs a sign-extension operation after the addition in order to stay within the i16 range. Because of this, it is often more efficient to perform integer operations at the machine register size.
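The effect of staying within a 16-bit range can be seen directly in Go, where 16-bit arithmetic is defined to wrap around. This small illustration (the helper name add16 is invented here) shows the truncation the compiler must perform after the machine-level addition:

```go
package main

import "fmt"

// add16 adds two 16-bit integers. Go defines the result to wrap around,
// so after adding in a wider machine register the compiler must truncate
// (and sign-extend) the result back into the int16 range.
func add16(a, b int16) int16 {
	return a + b
}

func main() {
	// The mathematical result, 40000, does not fit into int16,
	// so the value wraps around to a negative number.
	fmt.Println(add16(30000, 10000)) // -25536
}
```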
What happens next with this IR code is not of particular interest to us now. The code is optimized (although in an example as simple as ours there is nothing left to optimize) and then converted to machine code.
Second example
The next example we'll look at is a bit more complicated. Namely, we are talking about a function that sums a slice of integers:
func sum(numbers []int) int {
n := 0
for i := 0; i < len(numbers); i++ {
n += numbers[i]
}
return n
}
This code translates to the following Go SSA code:
func sum(numbers []int) int:
entry:
jump for.loop
for.loop:
t0 = phi [entry: 0:int, for.body: t6] #n int
t1 = phi [entry: 0:int, for.body: t7] #i int
t2 = len(numbers) int
t3 = t1 < t2 bool
if t3 goto for.body else for.done
for.body:
t4 = &numbers[t1] *int
t5 = *t4 int
t6 = t0 + t5 int
t7 = t1 + 1:int int
jump for.loop
for.done:
return t0
Here you can already see more constructs that are typical of the SSA representation. Probably the most obvious feature of this code is that there are no structured flow-control statements. To control the flow of computation there are only conditional and unconditional jumps and, if we count it as a flow-control instruction, the return instruction.
In fact, note that the program is not divided into blocks using curly braces (as in C-family languages). Instead it is divided by labels, which is reminiscent of assembly languages, and presented as basic blocks. In SSA, a basic block is a contiguous sequence of code that starts with a label and ends with basic-block terminator instructions such as return and jump.
Another interesting detail of this code is the phi instruction. The instruction is quite unusual, and it may take some time to figure it out. Remember that SSA stands for static single assignment: each variable is assigned exactly once. That works for a trivial function like myAdd shown above, but it does not work for more complex functions like the sum function discussed in this section. In particular, the variables i and n change during execution of the loop.
SSA circumvents the single-assignment restriction using the so-called phi instruction (its name comes from the Greek letter φ). It is one of the tricks needed for the SSA representation to work for languages like C. The result of this instruction is the current value of the variable (i or n), and its parameters are a list of basic blocks. For example, consider this statement:
t0 = phi [entry: 0:int, for.body: t6] #n
Its meaning is as follows: if the previous basic block was the entry block, then t0 is the constant 0, and if the previous basic block was for.body, then the value t6 from that block should be taken. This may all look rather mysterious, but it is this mechanism that makes SSA work. From a human point of view it makes the code harder to understand, but the fact that each value is assigned exactly once greatly simplifies many optimizations.
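As a rough mental model, a phi node behaves like a switch on the predecessor block. The sketch below is purely illustrative (the function name phiT0 and the string-based block names are invented for this example; this is not how LLVM actually implements phi):

```go
package main

import "fmt"

// phiT0 models t0 = phi [entry: 0, for.body: t6]: the value it produces
// depends on which basic block transferred control to the current one.
func phiT0(pred string, t6 int) int {
	switch pred {
	case "entry":
		return 0 // first loop iteration: n starts at 0
	case "for.body":
		return t6 // later iterations: carry over the sum from the loop body
	}
	panic("unknown predecessor block")
}

func main() {
	fmt.Println(phiT0("entry", 42))    // 0
	fmt.Println(phiT0("for.body", 42)) // 42
}
```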
Note that if you write your own compiler, you usually don't have to deal with this sort of thing directly. Even Clang doesn't generate all these phi instructions; it uses the alloca mechanism instead (which resembles working with ordinary local variables). These allocas are then converted to SSA form by an LLVM optimization pass called mem2reg. TinyGo, however, receives its input from Go SSA, which, conveniently, is already in SSA form.
Another novelty of this fragment of intermediate code is that indexing into the slice is represented as an address-computation operation followed by a dereference of the resulting pointer. You can also see constants added directly to the IR code (for example, 1:int); this did not come up in the myAdd example. Now, having covered these features, let's look at what this code turns into when converted to LLVM IR:
define i64 @sum(i64* %ptr, i64 %len, i64 %cap) {
entry:
br label %for.loop
for.loop: ; preds = %for.body, %entry
%0 = phi i64 [ 0, %entry ], [ %5, %for.body ]
%1 = phi i64 [ 0, %entry ], [ %6, %for.body ]
%2 = icmp slt i64 %1, %len
br i1 %2, label %for.body, label %for.done
for.body: ; preds = %for.loop
%3 = getelementptr i64, i64* %ptr, i64 %1
%4 = load i64, i64* %3
%5 = add i64 %0, %4
%6 = add i64 %1, 1
br label %for.loop
for.done: ; preds = %for.loop
ret i64 %0
}
Here, as before, we can see the same structure expressed with different syntax. For example, in phi instructions the values and labels are swapped. However, there is something here that is worth paying special attention to.
For starters, you can see a completely different function signature. LLVM does not support slices, so, as an optimization, the TinyGo compiler that generated this intermediate code split this data structure into its parts. It could have represented the three slice components (ptr, len, and cap) as a struct, but representing them as three separate values enables some optimizations. Other compilers may represent the slice in some other way, depending on the target platform's function calling conventions.
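The ptr/len/cap triple is visible from ordinary Go code: a slice value carries exactly those three words, two of which are exposed by the built-in len and cap functions (the helper name sliceWords is invented for this illustration):

```go
package main

import "fmt"

// sliceWords exposes two of the three words of a Go slice value;
// the third word is the pointer to the backing array.
func sliceWords(s []int) (length, capacity int) {
	return len(s), cap(s)
}

func main() {
	s := make([]int, 3, 8)
	fmt.Println(sliceWords(s)) // 3 8

	// Reslicing changes len and cap but keeps pointing into the same
	// backing array, which is why the three parts can be passed around
	// as separate values.
	fmt.Println(sliceWords(s[1:2])) // 1 7
}
```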
Another interesting feature of this code is the use of the getelementptr instruction (often abbreviated as GEP). This instruction works with pointers and is used here to get a pointer to a slice element. For example, let's compare it to the following code written in C:
int* sliceptr(int *ptr, int index) {
return &ptr[index];
}
Or with the following equivalent:
int* sliceptr(int *ptr, int index) {
return ptr + index;
}
The most important thing here is that the getelementptr instruction does not perform any dereferencing: it just calculates a new pointer based on an existing one. It can be thought of as mul and add instructions at the hardware level. You can read more about the GEP instruction in the LLVM documentation.
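The same address computation can be spelled out in Go with the unsafe package. This is a sketch for illustration only; idiomatic Go would simply write &s[i], and the function name sliceptr is borrowed from the C example above:

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceptr computes the address of element index, just like getelementptr:
// it only does pointer arithmetic and never dereferences anything.
func sliceptr(ptr *int64, index int) *int64 {
	offset := uintptr(index) * unsafe.Sizeof(int64(0)) // "mul"
	return (*int64)(unsafe.Add(unsafe.Pointer(ptr), offset)) // "add"
}

func main() {
	s := []int64{10, 20, 30}
	fmt.Println(*sliceptr(&s[0], 2)) // 30
}
```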
Another interesting feature of this intermediate code is the icmp instruction. This is a general-purpose instruction for integer comparisons. The result of executing it is always a value of type i1, a boolean. In this case the comparison is made using the keyword slt (signed less than), since we are comparing two numbers that were of type int. If we were comparing two unsigned integers, we would still use icmp, but the comparison keyword would be ult (unsigned less than). To compare floating-point numbers, a separate instruction is used, fcmp, which works in a similar way.
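The difference between slt and ult is about how the same bit pattern is interpreted, which can be demonstrated in Go by reinterpreting a signed value as unsigned (the helper names signedLess and unsignedLess are invented for this illustration):

```go
package main

import "fmt"

// signedLess corresponds to icmp slt, unsignedLess to icmp ult, applied
// to the same 64-bit pattern: the instruction, not the type, decides
// how the sign bit is interpreted.
func signedLess(a, b int64) bool   { return a < b }
func unsignedLess(a, b int64) bool { return uint64(a) < uint64(b) }

func main() {
	fmt.Println(signedLess(-1, 1))   // true
	fmt.Println(unsignedLess(-1, 1)) // false: -1 reinterpreted as the largest uint64
}
```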
Results
I believe that in this article I have covered the most important features of LLVM IR. Of course, there is a lot more. In particular, the intermediate representation can contain many annotations that let optimization passes take into account facts about the code that are known to the compiler but cannot otherwise be expressed in IR. For example, there is the inbounds flag on GEP instructions, or the nsw and nuw flags that can be added to add instructions. The same goes for the private keyword, which tells the optimizer that the function it marks will not be referenced from outside the current compilation unit. This enables many interesting interprocedural optimizations, such as eliminating unused arguments.
More details about LLVM can be found in the LLVM documentation.
Source: habr.com