In embedded development, the size and operating efficiency of the code are very important. Code size often dictates the flash and RAM capacity required of the chip, and the program's efficiency dictates the performance of the processor it must run on. In most cases, experienced developers want to reduce code size and improve execution efficiency, but how should they go about it? This article uses the compiler from IAR Systems, an internationally known compiler vendor, as an example to answer questions that developers often encounter in practice. Engineers can verify the points made here with the IAR compiler.
For embedded systems, the size and efficiency of the final code depend on the executable code generated by the compiler, not on the source code written by the developer; however, well-written source code can help the compiler generate better executable code. Developers must therefore not only design the source code with overall efficiency in mind, but also pay close attention to what the compiler can do and to how easily it can optimize the code.
An optimizing compiler can generate small, fast executable code. It does so by repeatedly transforming the source code. Most compiler optimizations rest on a sound mathematical or logical basis. Some, however, are heuristic: experience has shown that certain transformations often produce better code or open up opportunities for further optimization.
Only in a few cases does optimization rely on compiler black magic. Most of the time, the way the source code is written determines whether the compiler can optimize the program. In some cases, even minor changes to the source code can have a major impact on the efficiency of the code the compiler generates.
This article discusses what to watch for when writing code, but one thing should be clear first: there is no need to minimize the amount of source code. Using ?: expressions, post-increments, and comma expressions to squeeze several side effects into a single expression will not make the compiler produce more efficient code. It will only make your source code obscure and hard to maintain. For example, a post-increment or assignment in the middle of a complex expression is easily overlooked when reading the code. Write code in an easy-to-read style.
Loops
Is there anything wrong with the following seemingly simple loop?
for (i = 0; i != n; ++i)
{
a[i] = b[i];
}
Although the loop contains no errors, several details affect the efficiency of the code the compiler generates.
For example, the type of the index variable should match the pointer.
An array expression such as a[i] really means *(&a[0] + i), which at the machine level is the address of the first element plus a byte offset of i*sizeof(a[0]). In plain terms, the offset of the i-th element is added to a pointer to the first element of a. For pointer arithmetic, it is generally best if the index expression has the same size as the pointer (except for __far pointers, where the pointer and the index expression have different sizes). If the type of the index expression does not match the size of the pointer, the index must be converted to the correct type before it is added to the pointer.
If stack space (the stack normally lives in RAM) is more precious in the application than code size (code normally lives in ROM or flash), you can choose a smaller type for the index variable to reduce stack usage, but this usually costs both code size and execution time: the code becomes larger and slower. Moreover, the conversion also gets in the way of loop optimizations.
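The index-type point can be illustrated with a minimal sketch. The function names copy_narrow_index and copy_wide_index are hypothetical, and the sketch assumes that size_t is as wide as a data pointer on the target, which is typical but not guaranteed. Both functions copy the same data; they differ only in the type of the index variable.

```c
#include <assert.h>
#include <stddef.h>

/* Index narrower than the pointer: on many targets the compiler has to
   widen i to pointer size before every address calculation. */
void copy_narrow_index(char *a, const char *b, unsigned char n)
{
    unsigned char i;
    assert(a != NULL && b != NULL);
    for (i = 0; i != n; ++i)
    {
        a[i] = b[i];
    }
}

/* Index of type size_t (assumed here to match the pointer size):
   no conversion is needed before the index is added to the pointer. */
void copy_wide_index(char *a, const char *b, size_t n)
{
    size_t i;
    assert(a != NULL && b != NULL);
    for (i = 0; i != n; ++i)
    {
        a[i] = b[i];
    }
}
```

Whether the second form is actually smaller or faster depends on the target and the compiler, so it is worth comparing the generated code for both.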
Beyond the index type, we also need to pay attention to the loop condition, because many loop optimizations can only be performed if the number of iterations can be calculated before entering the loop. This calculation is more complicated than subtracting the initial value from the final value and dividing by the increment. For example, if i is an unsigned char, n is an int, and the value of n is 1000, what happens? With an 8-bit char, the variable i wraps around to 0 long before it reaches 1000.
Although the programmer certainly does not want an infinite loop that repeatedly copies 256 elements from b to a, the compiler cannot know the programmer’s intention. It must assume the worst case and cannot apply optimizations that require the iteration count to be known before entering the loop. In addition, if the final value is a variable, avoid the relational operators <= and >= in the loop condition. If the loop condition is i <= n, n might be the largest value representable in its type, so the compiler must assume that the loop is potentially infinite.
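The wrap-around behavior and a loop with a computable iteration count can be sketched as follows. The function names steps_until_wrap and sum_bytes are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many increments it takes for an unsigned char that starts
   at 0 to wrap around to 0 again, that is, UCHAR_MAX + 1 steps. */
unsigned int steps_until_wrap(void)
{
    unsigned char i = 0;
    unsigned int steps = 0;
    do
    {
        ++i;        /* the result is reduced modulo UCHAR_MAX + 1 */
        ++steps;
    } while (i != 0);
    return steps;
}

/* i has the same type as n and the condition is i < n, so the loop
   runs exactly n times and the count is known before the loop starts. */
unsigned int sum_bytes(const unsigned char *data, unsigned int n)
{
    unsigned int i;
    unsigned int sum = 0;
    assert(data != NULL || n == 0);
    for (i = 0; i < n; ++i)
    {
        sum += data[i];
    }
    return sum;
}
```

On a target with an 8-bit char, steps_until_wrap() returns 256, which is why an unsigned char index can never reach a limit of 1000.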
Aliases
In general, we do not recommend using global variables. A global variable can be modified anywhere in the program, and the behavior of the program changes with its value. This creates complex dependencies that make the program hard to understand, and it becomes difficult to determine how changing the value of a global variable will affect the program. From the optimizer’s point of view the situation is even worse, because the value of any global variable can be changed by a store through a pointer. When a variable can be accessed in more than one way, this is called aliasing, and aliasing makes code much harder to optimize.
char *buf;
void clear_buf(void)
{
int i;
for (i = 0; i < 128; ++i)
{
buf[i] = 0;
}
}
Although the programmer knows that writing to the buffer pointed to by buf will not change the variable buf itself, the compiler still has to prepare for the worst and reload buf from RAM on every iteration of the loop.
If you pass the address of the buffer as a parameter instead of using a global variable, you eliminate the alias:
void clear_buf(char *buf)
{
int i;
for (i = 0; i < 128; ++i)
{
buf[i] = 0;
}
}
With this change, the pointer buf can no longer be affected by stores through the pointer. It therefore stays unchanged throughout the loop, and its value only needs to be loaded once before the loop instead of being reloaded on every iteration.
If you need to pass information between pieces of code that do not share a caller/callee relationship, global variables are fine. For computationally intensive tasks, however, especially those involving pointer operations, it is best to use automatic variables.
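When the global variable has to stay, one possible variation is to copy it into an automatic variable before the loop. This is a minimal sketch rather than a method the article prescribes; the function name clear_buf_local and the local pointer p are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

char *buf;  /* global pointer, as in the first example */

void clear_buf_local(void)
{
    /* Read the global once. The address of p is never taken, so stores
       through p cannot modify p itself. */
    char *p = buf;
    int i;
    assert(p != NULL);
    for (i = 0; i < 128; ++i)
    {
        p[i] = 0;
    }
}
```

The design idea is the same as passing the pointer as a parameter: the loop works on a variable that no store through a pointer can reach.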
Avoid post-increment and post-decrement where possible
Everything said below about post-increment also applies to post-decrement. The C standard describes the semantics of post-increment as follows: “The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented.” Although microcontrollers commonly have addressing modes that can increment a pointer after a load or store operation, few of them can handle post-increment of other types with the same efficiency. To comply with the standard, the compiler may have to copy the operand to a temporary variable before performing the increment. For straight-line code, the increment can be taken out of the expression and placed after it. For example, the expression
foo = a[i++];
can be rewritten as
foo = a[i];
i = i + 1;
But what happens if the post-increment is part of the condition of a while loop? Since there is no place to insert the increment after the condition, the increment must be performed before the test. For common constructs like this, which bear directly on the efficiency of the generated code, tools such as IAR Systems' Embedded Workbench provide optimizations distilled from extensive practical experience.
For example, the following loop
i = 0;
while (a[i++] != 0)
{
...
}
is effectively executed as
loop:
temp = i; /* save the value of the operand */
i = temp + 1; /* increment the operand */
if (a[temp] == 0) /* use the saved value */
goto no_loop;
...
goto loop;
no_loop:
or
loop:
temp = a[i]; /* use the value of the operand */
i = i + 1; /* increment the operand */
if (temp == 0)
goto no_loop;
...
goto loop;
no_loop:
If the value of i after the loop does not matter, it is better to put the increment inside the loop body. For example, the following almost identical loop
i = 0;
while (a[i] != 0)
{
++i;
...
}
can be executed without a temporary variable:
loop:
if (a[i] == 0)
goto no_loop;
i = i + 1;
...
goto loop;
no_loop:
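The two source forms discussed above can be compared side by side in a small sketch. The function names scan_post_increment and scan_increment_in_body are hypothetical, and the loop bodies are left empty for brevity. Note that the two loops are "almost identical" rather than identical: they leave different values in i, which is why the rewrite only applies when the value of i after the loop does not matter.

```c
#include <assert.h>
#include <stddef.h>

/* Post-increment in the loop condition: i is incremented even when the
   test fails, so it ends up one past the terminating zero. */
size_t scan_post_increment(const char *a)
{
    size_t i = 0;
    assert(a != NULL);
    while (a[i++] != 0)
    {
        /* loop body */
    }
    return i;
}

/* Increment inside the loop body: i stops at the terminating zero and
   no saved copy of the old value is needed. */
size_t scan_increment_in_body(const char *a)
{
    size_t i = 0;
    assert(a != NULL);
    while (a[i] != 0)
    {
        ++i;
        /* rest of the loop body */
    }
    return i;
}
```

For the string "abc", the first function returns 4 and the second returns 3.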
Developers of optimizing compilers know very well that post-increment complicates code generation. Although we do our best to recognize these patterns and eliminate as many temporary variables as possible, there will always be cases where efficient code cannot be generated, especially when the loop condition is more complicated than the one above. It usually helps to split a complex expression into several simpler ones, just as the loop condition above was split into a test and an increment.
In C++, the choice between pre-increment and post-increment matters even more. This is because operator++ and operator-- can each be overloaded in both prefix and postfix form. When overloading an operator for a class, you are not required to imitate the behavior of the operator for the basic types, but it should be as close as possible. Therefore, classes for which incrementing and decrementing makes intuitive sense, such as iterators, usually provide both prefix forms (operator++() and operator--()) and postfix forms (operator++(int) and operator--(int)).
To mimic the behavior of prefix ++ for the basic types, operator++() can modify the object and return a reference to the modified object. What about mimicking postfix ++ for the basic types? Recall: “The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented.” Just as in the non-straight-line code above, the implementer of operator++(int) must copy the original object, modify the original object, and return the copy by value. Because of the copy, operator++(int) is more expensive than operator++().
For basic types, if the result of i++ is ignored, the optimizer can usually eliminate unnecessary copying, but the optimizer cannot change the call to one overloaded operator into another. If you write i++ instead of ++i out of habit, you will call the more expensive increment operator.
Although we have argued against post-increment throughout, we must admit that it is still useful in some situations. If post-incrementing a variable is really what you mean, go ahead and use it. But do not use post-increment merely to avoid writing one more line of code to increment a variable.
Every time you add an unnecessary post-increment to a loop condition, an if condition, a switch expression, a ?: expression, or a function call argument, you may force the compiler to generate larger and slower code. Is this list too long to remember? Start developing good habits today! Before using a post-increment, ask yourself whether the increment could be written as the next statement instead.
Concluding remarks
Of course, software development is not only a matter of developers accommodating the compiler. Cooperation between the developer and the compiler is one of the foundations of fast, efficient programming. Compilers, for their part, must not only keep evolving with technology and languages, but also take a wide range of development habits into account. Compilers with a longer history and wider use can bring developers greater efficiency.
Therefore, knowing how to write code that a good compiler can optimize well lets users achieve twice the result with half the effort. The principles and tips in this article are best practices accumulated over a long time by companies such as IAR Systems. They can all be verified and explored in the company’s Embedded Workbench, whose tool interface shows the execution time and size of the code, helping you find the best solution.
Beyond general-purpose optimization, good tools also support highly flexible, customizable optimization settings. For example, IAR Embedded Workbench offers different optimization levels for execution speed and for code size. Depending on the application's requirements, the optimization level can be set for the entire project, for each source file, or even for each function, helping engineers find the best optimization scheme for their application. I hope this article helps developers gain a deeper understanding of program optimization.