Algorithm designers have always endeavored to optimize their programs. One time-honored optimization, especially for numeric programming, is loop unrolling, a technique that minimizes loop overhead. The quintessential loop-unrolling example is matrix multiplication. The following function multiplies a matrix and a vector. (The constants rows and cols have been previously defined.):.
void mult(int a[rows][cols], int x[cols], int y[cols]) {
for (int i = 0; i < rows; ++i) {
y[i] = 0;
for (int j = 0; j < cols; ++j)
y[i] += a[i][j]*x[j];
}
}
If cols is an even number, the overhead of incrementing and comparing the loop control variable j can be cut in half by "unrolling" the computation into pairs in the inner loop:.
void mult(int a[rows][cols], int x[cols], int y[cols]) {
for (int i = 0; i < rows; ++i) {
y[i] = 0;
for (int j = 0; j < cols; j += 2)
y[i] += a[i][j]*x[j] + a[i][j+1]*x[j+1];
}
}
In general, if cols is a factor of k, k operations can be performed each time the inner loop iterates, greatly reducing the overhead. The savings is only noticeable on large arrays, but that is precisely the case with industrial-strength mathematical computations.
Function inlining also constitutes a form of loop unrolling. Consider the following approach to computing powers of integers.
//: C05:Unroll.cpp
// Unrolls an implicit loop via inlining
#include
using namespace std;
template
inline int power(int m) {
return power
}
template<>
inline int power<1>(int m) {
return m;
}
template<>
inline int power<0>(int m) {
return 1;
}
int main()
{
int m = 4;
cout << power<3>(m) << endl;
} ///:~.
Conceptually, the compiler must generate three specializations of power<>, one each for the template parameters 3, 2, and 1. Because the code for each of these functions can be inlined, the actual code that is inserted into main( ) is the single expression m*m*m. Thus, a simple template specialization coupled with inlining here provides a way to totally avoid loop control overhead.[69] This approach to loop unrolling is limited by your compiler’s inlining depth, of course.
Compile-time selection
To simulate conditionals at compile time, you can use the conditional ternary operator in an enum declaration. The following program uses this technique to calculate the maximum of two integers at compile time.
//: C05:Max.cpp
#include
using namespace std;
template
struct Max {
enum {val = n1 > n2 ? n1 : n2};
};
int main() {
cout << Max<10, 20>::val << endl; // 20
} ///:~
If you want to use compile-time conditions to govern custom code generation, you can once again use specializations of the values true and false:
//: C05:Conditionals.cpp
// Uses compile-time conditions to choose code
#include
using namespace std;
template
struct Select {};
template<>
struct Select
static inline void f() { statement1(); }
private:
static inline void statement1() {
cout << "This is";
cout << " statement1 executing\n";
}
};
template<>
struct Select
static inline void f() { statement2(); }
private:
static inline void statement2() {
cout << "This is";
cout << " statement2 executing\n";
}
};
template
void execute() {
Select
}
int main() {
execute
} ///:~.
This program is the equivalent of the expression:
if (cond)
statement1();
else
statement2();
except that the condition cond is evaluated at compile time, and the appropriate versions of execute<>( ) and Select<> are instantiated by the compiler. The function Select<>::f( ) executes at runtime, of course. A switch statement can be emulated in similar fashion, but specializing on each case value instead of the values true and false.
Compile-time assertions
In Chapter 2 we touted the virtues of using assertions as part of an overall defensive programming strategy. An assertion is basically an evaluation of a Boolean expression followed by a suitable action: do nothing if the condition is true, or halt with a diagnostic message otherwise. The previous section showed how to evaluate compile-time Boolean expressions. The remaining challenge in emulating assertions at compile time is to print a meaningful error message and halt. All that is required to halt the compiler is a compile error; the trick is to insert helpful text in the error message. The following example of Alexandrescu[70] uses template specialization, a local class, and a little macro magic to do the job.
//: C05:StaticAssert.cpp
//{-g++}
#include
using namespace std;
// A template and a specialization
template
struct StaticCheck {
StaticCheck(...);
};
template<>
struct StaticCheck
// The macro (generates a local class)
#define STATIC_CHECK(expr, msg) { \
class Error_##msg{}; \
sizeof((StaticCheck
}
// Detects narrowing conversions
template
To safe_cast(From from) {
STATIC_CHECK(sizeof(From) <= sizeof(To),
NarrowingConversion);
return reinterpret_cast
}
int main() {
void* p = 0;
int i = safe_cast
cout << "int cast okay\n";