Member function pointers

If you have read Function pointers, you may be wondering how you might use function pointers to call the member functions of structs and classes. And you would be right to wonder. Regular function pointers will not work. What we need for this task is member function pointers, aka pointers to member functions. Strictly speaking, this only relates to non-static member functions, since static member functions do work with regular function pointers.

Pointers to member functions are quite strange beasts which are only superficially similar to regular function pointers. I think they are one of the nifty cool features of C++, but sadly they seem to be not well understood by a lot of C++ developers.

This article builds on the previous article on function pointers, and aims to explain member function pointers a little, and how they can be used to integrate C++ code with C callback functions.

How member functions are implemented

To understand member function pointers and why they are so different from regular function pointers, it may help to understand a little bit about how classes are actually implemented by the compiler. Suppose we have some code that looks like this:

// Simple class with a few member functions.
class Thing
{
public:
    // Constructor.  
    Thing(int offset); // : m_offset{offset} {}
    // Non-static member functions.
    int add(int a, int b); // { return a + b + m_offset; } 
    int mul(int a, int b); // { return a * b + m_offset; }
    // Static member function.
    static int div(int a, int b); // { return a / b; }
private:
    int m_offset;
};

Thing::Thing(int offset) 
: m_offset{offset} 
{
}

int Thing::add(int a, int b) 
{ 
    return a + b + m_offset; 
} 

int Thing::mul(int a, int b) 
{ 
    return a * b + m_offset; 
}

int Thing::div(int a, int b) 
{ 
    return a / b; 
}

// Using the class.
int test()
{
    Thing t{12};
    return t.add(13, 14);
}

I could have inlined the member functions inside the class to make the code shorter (see comments), but wanted it to closely resemble the C that follows. The code is implemented (more or less) by the compiler as if you had written C something like this:

// Simple class with a few associated functions.
// Not official C as no typedef 'tag' malarky here.
struct Thing
{
    // This throws away the notion of 'private'
    int m_offset;
};

void Thing_Thing(Thing* this, int offset)
{
    this->m_offset = offset;
} 

int Thing_add(Thing* this, int a, int b) 
{
    return a + b + this->m_offset;
}

int Thing_mul(Thing* this, int a, int b) 
{
    return a * b + this->m_offset;
}

int Thing_div(int a, int b) 
{ 
    return a / b; 
}

int test()
{
    Thing t;
    Thing_Thing(&t, 12);
    return Thing_add(&t, 13, 14);
}

The compiler automatically inserts an additional parameter called this which holds a pointer to the instance of the object on which the function is called. That is to say, my_thing.add(13, 19) would effectively be implemented as Thing_add(&my_thing, 13, 19). It’s more complicated when there are virtual functions or multiple inheritance, but that’s the basic idea. [Interestingly, this means you can call a member function without an actual object: a null pointer is sufficient – the compiler doesn’t care as long as the pointer has the correct type. this would be nullptr or, worse, uninitialised. Don’t do this on purpose. :)]. this is a keyword in C++, so the “as if” code won’t actually compile unless you replace it with another name.

The made up “as if” function names are not that far from reality. The compiler will invent a name for each function based on the namespace, the class name, the parameter types, and other factors. This is what we know as name-mangling. To you it looks like Thing::add; to the linker it looks like _ZN5Thing3addEii (from GCC), or something equally ugly. There is no ambiguity. Incidentally, this is also how function overloading works: different signatures lead to different names. Debuggers usually do a good job of de-mangling the names so you don’t have to scratch your head too much.

So, in principle, a member function pointer is just like a regular function pointer. The member function pointer for Thing::add() can be imagined to hold the address of Thing_add(). But unfortunately this is not enough information to make the call work.

  • The key thing to note is that when we use the member function pointer we will need somehow to supply a value for the this parameter in addition to values for the other parameters to the function. This is very different from plain function pointers.
  • And it gets more complicated when virtual functions and multiple inheritance are involved. I thought it was quite something when I discovered that a pointer to a virtual member function will do the right thing.
  • I don’t know this, but the compiler will presumably need to know which class the function pointer relates to, Thing in this case, so that functions of other classes with similar signatures can be rejected as being of different types (I’m glossing over the type of this here). I captured the class name in the “as if” function names, but I imagine the compiler wants a link to some detailed type info or something.
    • Note that a pointer to a member function of some class is also usable with matching member functions of derived classes.

In truth, the way member function pointers are actually represented internally is implementation defined. That means each compiler can do whatever it wants, so long as it works. In many cases, they may actually be simple function pointers as suggested above. Or they might be structs with several data members containing all the necessary information, or something completely different. They are black boxes that we will soon learn how to use, but shouldn’t look inside. The sizes of member function pointers can even be different for different classes, which was an unpleasant surprise when I found out. It’s really best not to think of them as holding meaningful information or function addresses at all. Just treat them as opaque gizmos that you can somehow use to invoke a member function of a particular object.

Note that the static member function Thing::div() does not have a this parameter. It doesn’t need one because static functions aren’t tied to any particular instance of a class, but can be called on any object, or even without any object. If you did want to give a static function access to a particular object, you’d have to add a Thing* parameter of your own. For this reason, static member functions can be used with regular function pointers.

Aliases for member function pointer types

Following on from the discussion of signatures for regular function pointers here, the signature of a member function pointer is very similar, but includes the type of the class or struct as well. Suppose we have the following class:

class Gadget
{
public:
    void foo(int, double);    
    void bar(int, double);    
};

In this code Gadget::foo() and Gadget::bar() have the same signature, and it is void (Gadget::*)(int, double). This is basically identical to the signature for a free function which takes int and double arguments and returns void, but also includes the name of the class and the scope resolution operator before the star. Declaring variables of member function pointer types is straightforward, but ugly. Invoking them to call functions is worse:

// Gadget::foo() is a function.
void Gadget::foo(int arg1, double arg2)
{
     // ...
}
 
// mem_fun_ptr is a member function pointer.
void (Gadget::*mem_fun_ptr)(int, double);
 
// mem_fun_ptr now holds the "address" of Gadget::foo().
mem_fun_ptr = &Gadget::foo;
 
Gadget gadget;
// Calls Gadget::foo(). See below.
(gadget.*mem_fun_ptr)(1, 2); 

We can declare member function pointer types (that is, aliases for the function signature) in much the same ways as for regular function pointers.

Typedef

// Direct variable declaration - avoid!!
// mem_fun_ptr is a variable: it is a  
// pointer to member function of type
// void (Gadget::*)(int, double).
void (Gadget::*mem_fun_ptr)(int, double);

// C style type alias - avoid.
// OldMemFunPtr is an alias for the pointer to 
// member function type void (Gadget::*)(int, double). 
typedef void (Gadget::*OldMemFunPtr)(int, double);

Type alias (C++11)

// C++11 style type alias - preferred.
// NewMemFunPtr is equivalent to OldMemFunPtr. 
using NewMemFunPtr = void (Gadget::*)(int, double);

Trailing return type (C++11)

// C++11 style type alias - trailing return type.
// NewMemFunPtr is equivalent to OldMemFunPtr. 
using TrailingMemFunPtr = auto (Gadget::*)(int, double) -> void;

As with regular function pointers, member function pointers can appear in the same places as other data types: as local variables, member variables, in containers, and so on. Aside from calling functions, they can be compared for equality (or inequality), but few other operators make any sense with them; the other relational operators, for example, probably won’t compile.

Using a member function pointer variable

So far we have covered what member function pointers are and how to alias them with simple names to hide the rather unpleasant syntax, but we haven’t actually used them for much function pointing. This is very simple: you just assign the function pointer with the address of a function, or the value of another function pointer, and then you use the function pointer to call the function whose address it currently holds:

class Maths 
{ 
public:
    void square(int x) 
        { std::cout << x << " squared is " << (x*x); }
    void cube(int x)   
        { std::cout << x << " cubed is " << (x*x*x); }
};
 
// Create an alias for the function pointer.
using MemFuncPtr = void (Maths::*)(int);
 
void test()
{
    Maths  maths;

    MemFuncPtr ptr = &Maths::cube;  
    // Read as maths.function(3) where
    // "function" is "dereference ptr".
    (maths.*ptr)(3); // Calls cube(3) 
 
    // Call with a Maths object
    ptr = &Maths::square; // & mandatory 
    (maths.*ptr)(5); // Calls square(5)
 
    MemFuncPtr ptr2 = ptr; 
    (maths.*ptr2)(5); // Calls square(5)

    // Call with a pointer to a Maths object
    Maths* maths_ptr = &maths;
    // Read as maths_ptr->function(6) where
    // "function" is "dereference ptr".
    (maths_ptr->*ptr2)(6); // Calls square(6)
}

Going through some of this in order:

  • MemFuncPtr is an alias for member function pointers with the signature void (Maths::*)(int).
  • We create an instance maths of the class Maths.
  • We declare a member function pointer and take the address of Maths::cube. Note that the & is always required when taking the address of member functions, which is different from non-member functions.
  • Next comes one of the two least well understood of all C++ operators: .*. It really isn’t all that bad, but could stand some explanation:
    • Remember when I said earlier that we need a way to pass the this pointer to the imaginary function implementation we are calling? That is precisely what this operator does.
    • I find it helpful to split the operator into two parts in my mind: the star just means “dereference the function pointer to give us a function”. This is analogous to the way dereferencing a data pointer gives us data. [Actually a reference to data, but ignore that.]
    • And then the other part of the operator, the dot, has exactly the same meaning it would have if we were calling a member function directly rather than through a function pointer.
    • The parentheses around (maths.*ptr) are required for reasons of operator precedence. This is unfortunate, as it makes the function call look weird and ugly. Shame, but that’s how it is.
  • Later we take the address of the object maths and then use the data pointer maths_ptr to invoke a member function pointer. This uses the other scary member access operator: ->*. As before, it may be simpler to interpret this in two parts. As before, the parentheses are required.

Note that like regular function pointers, member function pointers have distinct types (including the name of the class), and cannot be assigned to pointers of other types (except that a pointer to a member of a base class can be converted to a pointer to a member of a derived class). I’m not sure I would even cast them: what meaning would the result have?

And that’s it. Member function pointers are actually kind of cool, and not nearly as arcane as is commonly believed. There are some complexities in their implementation for virtual functions and in the presence of multiple inheritance, but this has no impact on how they are used in your code.

Interfacing member functions with C

Sadly it is not possible to pass member function pointers to C APIs. C simply has no concept of what they are, and can only handle pointers to free functions (and static member functions). There are a few simple things we can do to interface our C++ with C.

C to member function callbacks – version 1

So… you are writing your application in C++ but have to pass a callback to some C API. You would really like it if it would call a member function of a particular object. There is a simple way to do this: create an intermediate function to act as a trampoline.

using Callback = void (*)(int x, void* context);
void c_api_set_callback(Callback cb, void* context);

struct Foo
{
    Foo();
    void alpha(int x);
    void beta(int x);
    void gamma(int x);
    void delta(int x);
};

// Trampoline signature matches what the C expects.
void call_foo_alpha(int x, void* context)
{
    // We know this is a safe cast.
    // Translate into a member function call.
    auto foo = static_cast<Foo*>(context);
    foo->alpha(x);
}

// Constructor sets up the callback.
Foo::Foo()
{
    // Pass pointer to this object as context.
    c_api_set_callback(call_foo_alpha, this);
}

It is very simple and it works very well indeed. A problem with this approach is that we would need to write an individual trampoline function for each member function we wanted to use as a callback, even though they all have the same signature. This is not ideal.

Note also that the “trampoline” function needs to be able to see the members of the class it needs to use. They must be public, or it must be a friend. Alternatively, the trampoline could be a static member function of the class.

C to member function callbacks – version 2

There is a slightly better way, which makes use of a member function pointer (that’s why we’re here, after all) and allows us to write only one trampoline function for all the matching members.

The thing about member function pointers is that they are known at compile time. The type is known at compile time, obviously, but so are the specific values (however they might be implemented) corresponding to each particular member function. This means that we can use a member function pointer as a non-type template parameter. Like this:

// Alias for a member function pointer.
using MemFuncPtr = void (Foo::*)(int);

// The trampoline is now a template parameterised on 
// a particular member function pointer.
template <MemFuncPtr mem_func_ptr>
void call_foo_member(int x, void* context)
{
    // And use the member function pointer.
    auto foo = static_cast<Foo*>(context);
    (foo->*mem_func_ptr)(x);
}

Foo::Foo()
{
    // Instantiate the template for several different members.
    c_api_set_callback(call_foo_member<&Foo::alpha>, this);
    c_api_set_callback(call_foo_member<&Foo::beta>, this);
    c_api_set_callback(call_foo_member<&Foo::gamma>, this);
    c_api_set_callback(call_foo_member<&Foo::delta>, this);
}

This is great, but remember that templates are code generators. By instantiating the template four times in Foo::Foo(), we generated four different versions of the code, which all differ just by which member function they call (you can verify this by copying the code into Compiler Explorer). Four virtually identical functions! Is it bloat? Well, the functions are very small. Is it justified? There isn’t really another way, and the template saved us from some repetitive and error prone typing. As always with templates, if you actually need the code they generate, you need it. You would have written it by hand anyway. Or you might find another way to get the same result.

C to member function callbacks – version 3

The little template trampoline is nice, but we can do a little better still. At the moment the trampoline is only useful for members of Foo. It would be really useful if we had a single implementation which could always be used, regardless of which class the target member function belonged to. I suspect you know what’s coming: we can convert Foo itself into a template parameter. Like this:

// Template the member function pointer type alias.
template <typename Class>
using MemFuncPtr = void (Class::*)(int);

// Add Class to the template trampoline.
template <typename Class, MemFuncPtr<Class> mem_func_ptr>
void call_member(int x, void* context)
{
    auto obj = static_cast<Class*>(context);
    (obj->*mem_func_ptr)(x);
}

Foo::Foo()
{
    // Pass pointer to this object as context.
    c_api_set_callback(call_member<Foo, &Foo::alpha>, this);
}

struct Bar
{
    Bar();
    void something(int x);
};

Bar::Bar()
{
    // Assume Bar::something has the same signature.
    c_api_set_callback(call_member<Bar, &Bar::something>, this);
}

I’m quite pleased with that. Now we have a simple bit of template goodness which we can use every time we need to convert some member function with signature void (Class::*)(int) into a free function with signature void (*)(int, void*) which can be used as a callback with C libraries. It’s just a few lines of code but can do so much for us. We could however get it to do even more for us… We could generalise the return type and the number and types of the arguments. I’ll leave these for another day.

C to member function callbacks – version 4

Finally, the invocation of call_member<> is now quite long, and includes the name of the class twice: once by itself, and once as part of the member function pointer used as a non-type template parameter. We can shorten this a little by using an auto template parameter for the member function pointer. The following code adds a thin wrapper for the c_api_set_callback() function.

// C-style API taking a plain function pointer and 
// void pointer context
using Callback = void (*)(int x, void* context);
void c_api_set_callback(Callback cb, void* context);

// Template the member function pointer type alias.
template <typename Class>
using MemFuncPtr = void (Class::*)(int);

// Trampoline to redirect C callback to a member function
template <typename Class, MemFuncPtr<Class> mem_func_ptr>
void call_member(int x, void* context)
{
    auto obj = static_cast<Class*>(context);
    (obj->*mem_func_ptr)(x);
}

// Wrapper for c_api_set_callback which means we don't have 
// to pass the name of the class twice.
template <auto MemFuncPtr, typename Class>
void new_c_api_set_callback(Class* context)
{
    // This is the code we called directly in Foo::Foo() before.
    c_api_set_callback(call_member<Class, MemFuncPtr>, context);
}

struct Foo
{
    Foo();
    void alpha(int x);
    void beta(int x);
    void gamma(int x);
    void delta(int x);
};

Foo::Foo()
{
    // Now we just pass the member function pointer and
    // write the name of the class only once.
    new_c_api_set_callback<&Foo::alpha>(this);
    new_c_api_set_callback<&Foo::beta>(this);
    new_c_api_set_callback<&Foo::gamma>(this);
}

The trick here is that the new template function takes a Class* argument instead of a void* argument. This is sufficient for the compiler to work out the type of class for which MemFuncPtr is a member function pointer. This happens at the point where we actually call the template function in Foo::Foo(). This all adds up to needing to pass one less template argument when setting the callback function. We don’t have to duplicate the name of the class, which is convenient. A call to new_c_api_set_callback() is now basically the same amount of typing as a call to c_api_set_callback() would be with a free function. I don’t regard brevity as a particularly important goal, but do value clarity. It’s hard not to like the simplicity of using this template.

Of course, using this method also adds up to generating a distinct version of new_c_api_set_callback() each time the function template is used. Is it a price worth paying? Once again, they are very small functions. In fact, in this example they are optimised away completely. The compiler can do this easily because there are no tricksy function pointers which can’t be resolved. The template function is little more than a typesafe macro.

Conclusion

Pointers to member functions are a fascinating topic, and they are quite useful in some situations. I hope that this article managed to shed a little light on them and maybe demonstrate that they are not one of the dark arts, but actually relatively simple to understand.

It is common, especially in embedded code, to set callback functions using existing C APIs. I showed a simple way to interface your C++ classes with C callbacks, and then progressively generalised that method to make it more useful while avoiding the need to write a lot of repeated code.

In the next article, I will take some of these ideas and generalise the concept of callback functions to C++.

Function pointers

Function pointers are a commonly used feature of both C and C++, though perhaps to a lesser extent in C++ because some of the use cases are obviated by other language features. They seem to be the cause of some confusion, especially among beginners, so I thought it might be useful to go through a detailed introduction. If you are already comfortable with function pointers, I doubt you will learn much here, though I’d be interested in feedback on how to improve the content. Or perhaps have a look at member function pointers.

The following code is a simple example of using function pointers to get the ball rolling:

// Declare a function pointer **type**, FuncPtr, with
// signature int(*)(int).
using FuncPtr = int (*)(int);

// Define some simple functions.
int square(int x) { return x * x; }
int cube(int x)   { return x * x * x; }
int fourth(int x) { return x * x * x * x; }

int test(int x, int power)
{
    // Declare a function pointer **variable**, func_ptr.
    FuncPtr func_ptr;
    
    // Take the address of one of the functions.
    switch (power)
    {
        case 3:  func_ptr = cube;   break; 
        case 4:  func_ptr = fourth; break; 
        default: func_ptr = square;  
    }    

    // Call whichever function was selected.
    return func_ptr(x);
}

Aside: Although it may seem odd at first, functions have addresses. The compiled code is just a bunch of bytes, and must live somewhere in the software’s address space. Calling the function amounts to setting the processor’s program counter to this address. There is more to do in order to pass arguments, save the current processor state, and the like, but that’s the core idea.

Functions have types

To understand function pointers, I think it may be useful to consider function types a little. Every function has a function type associated with it which is derived from the return type and the argument types. For example, a function whose prototype is void foo(int x, double y) has a type of void(int,double) – you literally just remove the function name and argument names.

  • The function void bar(int alpha, double beta) has the same type as foo.
  • The member function void Class::member(int a, double b) has the same type.
  • The function int baz(int x, int y, int z) has a different type, which is int(int,int,int).

A function type is somewhat like a data type such as int, but you can’t do a lot with it. You can’t even declare variables of a function type. What would they even be? I guess they’d be whole functions. Although I have never seen this in practice, you can create an alias for a function type and use it to declare functions. Like this:

// Alias for a function type
using FuncType = void(int, double);

// Equivalent to void gamma(int, double); 
FuncType gamma; 

// Equivalent to void delta(int, double); 
FuncType delta;

struct Thing
{
    // Equivalent to void member(int, double); 
    FuncType member;
};

The alias can be used to declare functions but not to define them. So to implement Thing::member you need to write this:

void Thing::member(int, double)
{
}

This is quite an obscure corner of the language which I’m only including for background. I would definitely avoid doing anything directly with function types like the examples just given: you will just confuse everyone. Note that my description of function types is very informal compared to the C and C++ standards. You should check those for full details. But I think I’ve covered enough for our purposes.

Function pointers = pointers to function types

I said above that you can’t declare variables of function type. However, what you can do is declare variables whose type is pointer to function type: these are function pointers. In much the same way as data pointers hold the addresses of data objects, function pointers hold the addresses of functions. Function pointers are much more interesting and useful than function types.

Sadly, declaring function pointers is not pretty. You already know this, but to declare a variable of type pointer to int or pointer to double, you just do this:

// x is a variable of type pointer to int
int* x;    

// y is a variable of type pointer to double
double* y; 

Simple enough. To declare a variable of a pointer to function type such as void(int,double) you have to use some quite ugly and confusing syntax:

// fizz is a variable of type 
// pointer to void(int,double)
void (*fizz)(int, double);

// buzz is a variable of type 
// pointer to int(int,int,int)
int (*buzz)(int, int, int);

Both fizz and buzz are function pointers. The names of these variables are mixed into the signatures when declaring them. This is often a source of great confusion. I think this might be my candidate for the ugliest syntax in all of C and C++. We will shortly simplify these declarations…

Even though both fizz and buzz are function pointers, it is important to understand that they have completely distinct data types. They are as distinct as x and y above, an int* and a double*, respectively.

  • Function pointers are data types. They may be null, or may hold the address of a function which has a matching signature (i.e. a matching function type).
  • Function pointers can be used in all the same places as other data types: as local variables, as members of structs and classes, in arrays, in STL containers, allocated on the heap (I’ve never seen this), as non-type template parameters, and so on.
  • Function pointers can also be used in expressions with comparison and boolean operators but, unlike data pointers, arithmetic and bitwise operators make no sense and are illegal in C++.
  • A function pointer type itself can also be used as a template parameter.

So that has hopefully cleared up what function pointers are. The basic syntax for declaring them is just horrible. So…

Aliases for function pointer types

This section looks at how to use aliases to make function pointer declarations look just like other variable declarations. There are a few common ways to alias a function pointer type.

Typedef

The first is inherited from C and is something you should become familiar with even though I don’t recommend using it in your own code. You will see it a lot in older C++ code, and it is the only method available in C. This method creates an alias with typedef. The syntax is exactly the same as declaring a variable above, but with the keyword typedef in front:

// Old style function pointer **type** - avoid.
typedef void (*OldFuncPtr)(int, double);

// ptr is a function pointer.
OldFuncPtr ptr; 

// vec is a vector of function pointers.
std::vector<OldFuncPtr> vec;

// Thing::m_ptr is a function pointer.
// (*NOT* a pointer to a member function, but
// a plain function pointer which is a data 
// member.)
class Thing
{
    OldFuncPtr m_ptr;
};

// func is a non-type template parameter for 
// class Something.
template <OldFuncPtr func>
class Something {};

The code includes a few use cases for OldFuncPtr to demonstrate (I hope) that using the alias is simpler than using the raw function pointer syntax. We could, for example, have defined vec as follows, which is equivalent. You will sometimes see code like this, but I know which I prefer:

// vec is a vector of function pointers.
std::vector<void(*)(int, double)> vec;

Type alias

A typedef gives a more user friendly name to a function pointer type, but still uses the old C syntax when creating the alias. C++11 introduced a better type alias with the using keyword:

// New style function pointer type - preferred.
using NewFuncPtr = void (*)(int, double); 

This pulls the alias name of the function pointer type out from the middle of the signature and is, I think, a lot more readable. The code basically just says “this simple name is equivalent to this ugly thing”. NewFuncPtr is identical to OldFuncPtr and they can be used interchangeably. The parentheses around the star are required because of operator precedence – to distinguish our function pointer from a function type which returns void* in this example. I generally just read “(*)” as “pointer to function”.

Something to note is that type aliases can be templates. Like this:

template <typename T>
using FuncPtrT = void (*)(T, T);

// fi is a function pointer of type
// void (*)(int, int).
FuncPtrT<int> fi; 

// fd is a function pointer of type
// void (*)(double, double).
FuncPtrT<double> fd; 

This can be very useful sometimes. For completeness, if we wish, we can also alias this template for particular template parameters:

using IntFuncPtr = FuncPtrT<int>;

// fi2 is a function pointer of type
// void (*)(int, int).
IntFuncPtr fi2; 

Trailing return type

C++11 introduced a variation on how to declare a function. The return type can come at the end instead of at the beginning. This is useful/essential when the return type is not known until the argument types are known to the compiler, which can happen in some templates. The trailing syntax looks like this:

// New style function pointer type - trailing return.
using TrailingFuncPtr = auto (*)(int, double) -> void; 

TrailingFuncPtr is identical to OldFuncPtr and NewFuncPtr, and they can be used interchangeably. auto is required at the start as a stand-in for the return type, and the actual return type comes at the end after an arrow ->.

Some people now use the trailing syntax everywhere in their code for “consistency”. Personally, I don’t recommend this. I’m all for modernising code, but this seems to be gratuitously throwing away forty years of familiarity with billions of lines of code for zero gain. Change for change’s sake is not a good thing.

On the other hand, the trailing syntax does have a nice feature in this particular context. You can read the function pointer meaning directly from left to right: “TrailingFuncPtr is a pointer to a function which takes two arguments with types int and double, respectively, and returns void“. That’s much simpler than the C syntax, which bounces from going left to right to left again.

Using a function pointer alias

I have never seen this used anywhere, but we could theoretically declare an alias for a function type rather than a pointer to function type, and then declare pointers of that type. Like this:

// Alias for a function type
using FuncType = void(int, double);

// This is a pointer to functions with type
// void(int,double).
FuncType* func_ptr;

// Alias for a function pointer type
using FuncTypePtr = FuncType*;

// This is a pointer to functions with type
// void(int,double).
FuncTypePtr func_ptr2;

In this example, func_ptr and func_ptr2 have exactly the same data type. Given its apparent rarity, I’d probably avoid aliasing function types directly.
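One related curiosity (my own sketch): a function type alias can even be used to declare a function of that type, although the definition must still be written out in full:

```cpp
#include <cassert>

using FuncType = void(int, double);

// Declares a function with signature void(int, double)...
FuncType do_stuff;

// ...which must still be defined with the normal syntax.
void do_stuff(int i, double d)
{
    (void)i; (void)d; // placeholder body
}
```

And FuncType* remains the matching pointer type, so `FuncType* p = do_stuff;` works as expected.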

Using a function pointer variable

So far we have covered what function pointers are and how to alias them with simple names to hide the rather unpleasant syntax, but we haven’t actually used them for any function pointing. This is very simple: you assign the function pointer with (the name of) a function, or its address, or the value of another function pointer, and then you use the function pointer to call the function whose address it currently holds:

void cube(int x)
{
    std::cout << x << " cubed is " << (x*x*x);
}

void square(int x)
{
    std::cout << x << " squared is " << (x*x);
}

// Create an alias for the function pointer.
using FuncPtr = void (*)(int);
using DiffPtr = void (*)(int, int);

int main()
{
    FuncPtr ptr = &cube;  
    ptr(3); // Calls cube(3) 

    ptr = nullptr;
    ptr(3); // Nothing to call - undefined behaviour (likely a crash)

    ptr = square; // & optional 
    ptr(5); // Calls square(5)

    FuncPtr ptr2 = ptr; 
    ptr2(6); // Calls square(6)

    DiffPtr diff; // Uninitialised
    diff = cube;  // Doesn't compile
    diff = &cube; // Doesn't compile
    diff = ptr;   // Doesn't compile
}

Key things to note from the code in main():

  • Function pointers can be null or uninitialised.
  • Function pointers can be assigned and re-assigned
    • They can be directly assigned with the name of a function. This is analogous to assigning an int variable with an immediate value: “int x; … x = 42;”.
      • One of the curiosities of C (which C++ inherited) is that you can assign the function name directly, or take the address of the name with &. These are equivalent: the function name implicitly decays to a pointer to the function, much as an array name decays to a pointer to its first element.
    • They can be assigned to/from other function pointer variables. This is analogous to assigning an int variable with the value of another int variable: “int x, y; … x = y;”.
      • Note that using & on another function pointer will not work: this returns a pointer to a pointer to a function, which makes sense. Fear not – your code will not compile if you do this.
    • Function pointers cannot be assigned with a function name (or function pointer) which does not have the same signature. This is analogous to not being able to assign a variable of type Foo to a variable of type Bar (operator overloading notwithstanding).
      • This makes sense if you think about it. If the function at some address expects two integer arguments to be passed to it then it will almost certainly break if you call it through a function pointer which passes a double or something. Arguments are placed on the stack before jumping to the function’s address: the function knows how to unpack its arguments from the stack. If something else is there instead, there will be trouble.
  • Function pointers are used with exactly the same function call syntax as a regular function with the signature they represent.
    • You can think of () as an operator which, when applied to a function pointer, calls the function which is being pointed at.
    • In fact, it turns out that () actually is an operator, and one that classes can overload with custom implementations. This is for another day.
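A related quirk (my own example): because a function name decays to a function pointer, dereferencing a function pointer simply gets you back where you started, so all of these calls are equivalent:

```cpp
#include <cassert>

int twice(int x) { return 2 * x; }

int call_every_which_way()
{
    int (*ptr)(int) = twice;
    int a = ptr(3);     // the usual call syntax
    int b = (*ptr)(3);  // explicit dereference works too
    int c = (**ptr)(3); // the dereferenced function decays straight
                        // back to a pointer, so even this compiles (!)
    return a + b + c;   // 6 + 6 + 6
}
```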

Function pointers v data pointers

When you think about it, a function pointer seems a lot like a data pointer.

  • Data resides in memory at some particular location in the software’s address space, and can be accessed by de-referencing a data pointer which holds that location (that is, the address of the data). The address of the data is the value of the data pointer.
  • By the same token, executable code resides in memory at some particular location in the software’s address space, and can be executed by placing that location (that is, the address of the function) into the processor’s program counter. The address of the function is the value of the function pointer.

It’s a nice analogy, but it’s a mirage. Data pointers and function pointers are not really the same thing, and they are not at all interchangeable. They are not even necessarily the same size (though on Cortex-M, as it happens, both are four bytes). Depending on the architecture, functions may not even reside in the same address space as data. That is, you could theoretically have both data with address 0x00020000 and a function with address 0x00020000, but with no overlap because they are in independent address spaces (for example, Harvard architecture devices).

So if you ever find yourself casting a data pointer into a function pointer, or vice versa: your code is probably broken. It might work if you call the function, depending on the architecture and the implementation, and/or it might start a game of Global Thermonuclear War. I’m no language lawyer, but it is also questionable whether casting between function pointers with different signatures is well-defined. Remember: casting is lying to the compiler. Or, if you prefer, it is saying to the compiler “Look away! I totally know what I’m doing”. Which is very often the same thing in my experience.

But still, when it comes right down to it, they are both just data types whose variables hold addresses, which are just numbers. Function pointers are not mysterious black magic, and that’s good to know.

Why are function pointers useful?

We’ve covered what function pointers are and how to use them a little bit, so now let’s finally look at some ways they are used in practice. Function pointers offer a simple indirection mechanism that can be used in many different ways to call a function whose identity is not necessarily known at compile time. We already saw an example of this in the function test() at the beginning of this article. The following sections demonstrate a few more examples.

Lookup table

Lookup tables are very common. One example might be in the implementation of a finite state machine. If all the action functions have the same signature, they could be stored in a table which is indexed by the current state and the current event. Or we could have something as simple as this:

// Declare a function pointer type.
using Func = int (*)(int);

// Define some simple functions.
int square(int x) { return x * x; }
int cube(int x)   { return x * x * x; }
int fourth(int x) { return x * x * x * x; }

// A lookup table of function pointers.
Func g_table[] = { square, cube, fourth };

int test(int x, int index)
{
    assert(index < 3);
    return g_table[index](x);
}
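The state machine idea mentioned above might look something like this minimal sketch (the states, events and action functions are hypothetical inventions of mine):

```cpp
#include <cassert>

// Hypothetical two-state, two-event machine whose action functions
// all share the signature int(int).
enum State { Idle, Running, NumStates };
enum Event { Start, Stop, NumEvents };

int start_running(int x) { return x + 1; } // dummy actions
int keep_idle(int x)     { return x; }
int keep_running(int x)  { return x; }
int stop_running(int x)  { return x - 1; }

using Action = int (*)(int);

// The lookup table, indexed by [state][event].
Action g_actions[NumStates][NumEvents] = {
    { start_running, keep_idle    }, // Idle:    Start, Stop
    { keep_running,  stop_running }, // Running: Start, Stop
};

int dispatch(State s, Event e, int x)
{
    return g_actions[s][e](x);
}
```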

Virtual functions are typically implemented internally using a table of function pointers: the vtable. The compiler does not know which implementation of a given virtual function you will call at run time, but it does know the index of that function in the vtable. So calling a virtual function involves a simple lookup in a table with a compile time constant index to find a function pointer. A lot of people seem to think virtual functions are very expensive. This is not directly true, but they can have an impact on cache hits, and cannot easily be optimised away. But if you actually need run time polymorphism, it could hardly be more efficient. [Note that I am glossing over the fact that virtual functions are member functions, and that pointers to member functions are not at all the same creatures as pointers to free functions. The compiler knows its business.]

Interface structure

I’ve often seen structures containing a set of function pointers which collectively provide an interface to some behaviour. This can be used in C to sort of modularise code. The client of the interface specifies what the functions do, but has no idea how they are implemented. An instance of the structure is populated somewhere during initialisation, and may even change during program execution. You almost never see code like this in C++ because this is a perfect use case for virtual functions in abstract base classes.

// Create instance of structure and populate somewhere.
struct Funcs
{
    // Direct signature usage to avoid three typedefs - lazy!
    void (*fiddle)(int);
    void (*faddle)(int, int);
    void (*muddle)(int, int, int);
    // ...
};

// Funcs likely passed as a pointer in C.
void test(int x, const Funcs& funcs)
{
    // Assume function pointers all non-null.
    funcs.fiddle(x);
    funcs.faddle(x, x);
    funcs.muddle(x, x, x);
}
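Wiring up such an interface during initialisation might look like this (a sketch of my own, repeating the struct so the snippet stands alone; the implementations are placeholders):

```cpp
#include <cassert>

struct Funcs
{
    void (*fiddle)(int);
    void (*faddle)(int, int);
    void (*muddle)(int, int, int);
};

// Hypothetical placeholder implementations.
int g_fiddled = 0;
void fiddle_impl(int x)         { g_fiddled += x; }
void faddle_impl(int, int)      { /* ... */ }
void muddle_impl(int, int, int) { /* ... */ }

// Populated once during initialisation. It could equally be
// re-pointed at a different set of implementations later.
Funcs g_funcs = { fiddle_impl, faddle_impl, muddle_impl };
```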

Speaking of virtual functions, although they are always described as having function pointers in a “table”, the functions do not all have the same signature. I wonder if the reality is in principle more like the interface structure in the example here. But it really makes no difference since this is just a detail of the implementation. Function pointers are just numbers, and the implementation knows which function pointer type corresponds to each index in the table. I think it is safe to assume that the compiler knows how to make virtual functions work properly.

Callback functions

One of the most common idioms that uses function pointers is callback functions, so I’ll cover them in some detail.

Imagine that you are using a library which needs to notify your code when certain things happen. Perhaps it internally implements an interrupt handler for some GPIO input and your application needs to know when an interrupt occurs. You cannot easily replace the interrupt service routine with your own, but the library lets you set a callback function. It might be implemented like this:

// Library code 
// ------------
// In the public header file
using IntHandler = void (*)(bool state); 
void set_interrupt_handler(IntHandler handler);

// In the private implementation file
IntHandler g_handler = nullptr; 
void set_interrupt_handler(IntHandler handler)
{
    g_handler = handler;
}

extern "C" void GPIO_IRQ_Handler()
{
    // Clear interrupt flags
    // ...

    // Read state of the pin
    bool state;
    // ...
    
    // Call the callback function
    if (g_handler)
    {
        g_handler(state);
    }
}

The library declares a function pointer type/alias and a function you can call in order to pass it the address of a matching function, which it will store in g_handler. You can now use this feature in your own code something like this:

// Your application code
// ---------------------
// This is the callback function. The 
// signature matches IntHandler.
void on_interrupt(bool state)
{
    // Do something here
}

int main()
{
    // Configure the pin using the library.
    // Pass the callback function to the library.
    // ...
    set_interrupt_handler(on_interrupt); 

    // ...    
}

And now on_interrupt() is called by the library whenever an interrupt occurs. From the application’s point of view, it is trivial to implement. From the library’s point of view, it’s a bit more effort, but not much.

Using function pointers like this is a staple of C interfaces. As with the other examples, callbacks are used all over the place.

Passing context information to callback functions

It is very common in such C interfaces to pass a void* value along with the function pointer. The value of the void* is meaningless to the library but can be used to give your own code some context when the callback is actually called. You might, for example, pass the address of a struct which contains some state which is useful in your application. This idiom is so common in C code that you are certain to encounter it. So I’ll include the full code again. I’ll use C-style declarations this time:

// Library code 
// ------------
// In the public header file - argument names optional but helpful
typedef void (*IntHandler)(bool state, void* context); 
void set_interrupt_handler(IntHandler handler, void* context);

// Or, if the C dev hates other people they might not bother 
// with the typedef at all. Don't do this in your code. I mean, 
// seriously, writing code like this is the very definition of 
// misanthropic:
void set_interrupt_handler(void (*handler)(bool, void*), void* context);

// In the private implementation file
// These two values might be placed in a struct to associate them.
IntHandler g_handler = nullptr; 
void*      g_context = nullptr;

// Or, if the C dev hates other people ...
void (*g_handler)(bool state, void* context); 

void set_interrupt_handler(IntHandler handler, void* context)
{
    g_handler = handler;
    g_context = context;
}

extern "C" void GPIO_IRQ_Handler()
{
    bool state;
    if (g_handler)
    {
        // Call the callback function, passing context.   
        g_handler(state, g_context);
    }
}

And now for your application code:

// Your application code
// ---------------------
struct SomeInfo { /* ... */ };

void on_interrupt(bool state, void* context)
{
    // Do something here with the context
    SomeInfo* info = static_cast<SomeInfo*>(context);
    // ...
}

SomeInfo g_interrupt_info; 

int main()
{
    // Configure the pin using the library.
    // Pass the callback function and context to the library.
    // ...
    set_interrupt_handler(on_interrupt, &g_interrupt_info);     

    // ...    
}

This is not much harder to use than the version without context, but gives your code more scope for disambiguation if the same callback is used more than once.
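Looking ahead to the next article: the context pointer is exactly how C++ objects are usually hooked up to C callbacks. A static (or free) "trampoline" function with the plain C signature recovers the object from the void* and forwards the call to a member function. A sketch of my own, reusing the handler shape from the example above:

```cpp
#include <cassert>

// Hypothetical class that wants to receive the interrupt callback.
class InterruptCounter
{
public:
    void on_interrupt(bool state) { if (state) ++m_count; }
    int  count() const            { return m_count; }

    // Trampoline with the plain C callback signature: cast the
    // context back to the object and forward to the member function.
    static void trampoline(bool state, void* context)
    {
        static_cast<InterruptCounter*>(context)->on_interrupt(state);
    }

private:
    int m_count = 0;
};

// Registration with the library would then look like:
//     InterruptCounter counter;
//     set_interrupt_handler(&InterruptCounter::trampoline, &counter);
```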

Conclusion

I hope I have managed to shed at least some light on the nature and uses of function pointers. For me, the notion that function types are just ordinary data types was not made clear early on, and understanding it sooner would have been helpful.

Although they are useful and interesting, and kind of essential reading in the embedded domain, function pointers are really a very C way of doing things. That is a slight oversimplification, but C++ does offer other abstractions which are often simpler to use. Virtual functions are a good example: one of the main uses for function pointers in C is to create run-time polymorphism which is almost indistinguishable from virtual functions. The built-in C++ version is efficient, and offers the compiler better opportunities for optimisation. Callback functions are still very important, especially if you have to integrate your application with C, which is usually the case in the embedded world.

The next article will move on to the generalisation of function pointers in C++: member function pointers. These are way more fun!

Traits for STM32 pin functions

This article presents another example of using trait templates to help us not shoot ourselves in the foot with invalid pin assignments. If this means nothing to you, please take a look at Traits templates 101. If it all looks spookily familiar, perhaps you already read Traits for wake-up pins. The comments below do present essentially the same idea as the article on wake-up pins, but a little repetition of good news never hurt anyone. And, in addition, I consider an alternative using constexpr functions.

Alternate pin functions

You may or may not have used STM32 microprocessors (FYI: they’re grrrrreat!). All that matters here is to understand that, like other device families, they have many GPIO pins and many hardware peripherals: USARTs, SPIs, I2Cs, ADCs, DACs, timers, and so on. Each pin can be used directly for digital input and output, or it can be configured to perform a particular “alternate” function for a particular peripheral, such as the RX function for a USART, MISO for a SPI peripheral, Input/Output Channel 1 for a timer, or whatever.

STMicroelectronics Discovery Board (STM32F407VG)

Now it turns out that in the STM32 hardware design, only certain specific pins can be used for certain specific functions for certain specific peripherals. There were presumably compromises in the multiplexing in the silicon. Each pin can typically be used for one of several different functions – up to fifteen. That is, the RX function for USART2 can only be performed by one of three (say) particular pins; and pin PA3 can be used to perform one of several alternate functions, of which USART2 RX is but one.

The datasheets for STM32 parts contain large lookup tables detailing every possible alternate function for every pin on the device. Not all variants of the devices have exactly the same lookup tables (they tend to be subsets of some super-table). When you are configuring a USART, say, you need to be careful to select appropriate pins for TX and RX (and CTS and RTS, if you use flow control). You also need to be careful to configure the correct alternate function index in each case. It would be very easy to make a mistake. I’ve made such mistakes once or twice. It was annoying and embarrassing.

Traits and alternate pin functions

Suppose I have a class USARTDriver which configures and uses a USART/UART peripheral on the STM32. We won’t care about its implementation, except that it is agnostic about the particular peripheral and pins it uses. We pass this information to the constructor with an instance of the following structure, which we create at compile time. I’ve omitted the baud rate and other stuff that could be included:

struct USARTConfig
{
    Periph usart;

    Port  tx_port;
    Pin   tx_pin;
    AltFn tx_alt_fn;

    Port  rx_port;
    Pin   rx_pin;
    AltFn rx_alt_fn;
};

  • Periph is an enumeration of all the peripherals on the device. Or we could use the base address for each peripheral’s memory mapped registers. Or something else.
  • Port is an enumeration of all GPIO ports on the device: PA – PK . Or we could use the base address for each port’s memory mapped registers. [STM32 ports are effectively just another kind of peripheral – one with sixteen sub-peripherals (pins) which share the same set of registers.]
  • Pin is an enumeration of possible pin values: P0 – P15 . Or we could just use an integer.
  • AltFn is an enumeration of possible alternate function indices: AF0 – AF15. Or we could just use an integer.

enum class Periph : uint8_t 
    { Usart2, Usart3, Tim2, Tim5, Tim9, /*...*/  };
enum class Port : uint8_t 
    { PA, PB, PC, /*...*/ };
enum class Pin : uint8_t 
    { P0, P1, P2, P3, /*...*/ };
enum class AltFn : uint8_t 
    { AF0, AF1, AF2, AF3, /*...*/ AF7, /*...*/ };

I’ve used my own enumerations in this example in order to completely isolate the vendor library code from the rest of the application. I’ve made them uint8_t explicitly so the compiler doesn’t eat up flash space with zeroes (the default underlying type for scoped enums is int). The driver implementation itself may or may not use the vendor library, but the rest of the application should neither know nor care.

Let’s create a trait class to capture the information about which pins can perform which alternate functions.

Primary template

As with the wake-up pins, we want a primary template for the general case:

enum class PinFn : uint8_t  
    { RX, TX, CH1, CH2, CH3, CH4, MOSI, /*...*/ };

template <Port PORT, Pin PIN, Periph PERIPH, 
    PinFn FUNC, bool DUMMY = false>
struct AltFnMap
{
    static_assert(DUMMY, "This alternate function "
        "combination does not exist.");
};

There is a new enumeration PinFn of all the different kinds of functions that pins can perform, and a template AltFnMap which is parametrised on a particular port, pin, peripheral and pin function combination. The DUMMY parameter exists only to make the assertion condition depend on a template parameter; without it, the compiler is entitled to reject the static_assert even when the primary template is never instantiated. Note that the base template will force a compilation error with a hopefully helpful error message. This is slightly better than simply leaving the primary template undefined and getting a generic “incomplete type” error.

Template specialisations

To capture the lookup table, we simply need to create a template specialisation for each valid combination. This is potentially large, but has no impact at run time. Each specialisation gives us the relevant alternate function index.

// PA2
template <> struct AltFnMap<Port::PA, Pin::P2, Periph::Tim2, PinFn::CH3> 
{ static constexpr AltFn alt_fn = AltFn::AF1; };
template <> struct AltFnMap<Port::PA, Pin::P2, Periph::Tim5, PinFn::CH3> 
{ static constexpr AltFn alt_fn = AltFn::AF2; };
template <> struct AltFnMap<Port::PA, Pin::P2, Periph::Tim9, PinFn::CH1> 
{ static constexpr AltFn alt_fn = AltFn::AF3; };
template <> struct AltFnMap<Port::PA, Pin::P2, Periph::Usart2, PinFn::TX> 
{ static constexpr AltFn alt_fn = AltFn::AF7; };
// ...

// PA3
template <> struct AltFnMap<Port::PA, Pin::P3, Periph::Tim2, PinFn::CH4> 
{ static constexpr AltFn alt_fn = AltFn::AF1; };
template <> struct AltFnMap<Port::PA, Pin::P3, Periph::Tim5, PinFn::CH4> 
{ static constexpr AltFn alt_fn = AltFn::AF2; };
template <> struct AltFnMap<Port::PA, Pin::P3, Periph::Tim9, PinFn::CH2> 
{ static constexpr AltFn alt_fn = AltFn::AF3; };
template <> struct AltFnMap<Port::PA, Pin::P3, Periph::Usart2, PinFn::RX> 
{ static constexpr AltFn alt_fn = AltFn::AF7; };
// ...

And that’s our trait template done. Creating this table would seem to be quite a big piece of work, and quite error prone, but in theory it would only have to be done once. More later…

Example usage

Using the trait templates to test pin selections – part 1

Now we could (but won’t) create the configuration for my driver instance as follows:

static constexpr Periph DBG_PERIPH  = Periph::Usart2; 
static constexpr Port   DBG_TX_PORT = Port::PA; 
static constexpr Pin    DBG_TX_PIN  = Pin::P2; 
static constexpr Port   DBG_RX_PORT = Port::PA; 
static constexpr Pin    DBG_RX_PIN  = Pin::P3; 

static constexpr USARTConfig debug_usart_conf = 
{
    DBG_PERIPH,

    DBG_TX_PORT,
    DBG_TX_PIN,
    AltFnMap<DBG_TX_PORT, DBG_TX_PIN, DBG_PERIPH, 
        PinFn::TX>::alt_fn,

    DBG_RX_PORT,
    DBG_RX_PIN,
    AltFnMap<DBG_RX_PORT, DBG_RX_PIN, DBG_PERIPH, 
        PinFn::RX>::alt_fn,
};

This looks a bit wordy, but all it does is define some constants. The constants for the peripheral, TX port and so on are really just there to avoid duplication in the definition of debug_usart_conf. You could imagine that these constants are actually created elsewhere, in some global system configuration file. The key thing to note is that if your selected combination of port, pin and peripheral cannot provide either the TX or RX functions, your code will not compile. That’s a nice outcome for not too much work.

It is worth messing about with code like this in the incomparable Compiler Explorer. Even with optimisation turned off, you will see lovely definitions of constant structures and no executable instructions at all except whatever else you put in for your example. But just change PA3 to PA2 and…

<source>: In instantiation of 'struct AltFnMap<(Port)0, (Pin)2, (Periph)0, (PinFn)0, false>':
<source>:104:5:   required from 'constexpr USARTConfig make_config() 
    [with Periph PERIPH = (Periph)0; Port TX_PORT = (Port)0; Pin TX_PIN = (Pin)2; 
     Port RX_PORT = (Port)0; Pin RX_PIN = (Pin)2]'
<source>:109:33:   required from here
<source>:67:19: error: static assertion failed: This alternate function combination does not exist.

That’s clear enough (ARM64 gcc 8.2).

Using the trait templates to test pin selections – part 2

So far, so good, but the way we initialise debug_usart_conf is still prone to error.

  1. We might make a mistake with the various duplications of names.
  2. We might enter the wrong values from PinFn: TX when we meant RX.
  3. We might forget to use AltFnMap altogether: Just enter AltFn::AF12 directly…

We can easily fix all this by creating another simple template which does all the tedious legwork for us. Like so:

template <Periph PERIPH, Port TX_PORT, Pin TX_PIN, 
    Port RX_PORT, Pin RX_PIN>
struct USARTConfigT : USARTConfig
{
    constexpr USARTConfigT() : USARTConfig 
    { 
        PERIPH, 
        TX_PORT, TX_PIN, AltFnMap<TX_PORT, TX_PIN, 
            PERIPH, PinFn::TX>::alt_fn, 
        RX_PORT, RX_PIN, AltFnMap<RX_PORT, RX_PIN, 
            PERIPH, PinFn::RX>::alt_fn 
    } 
    {}
};

And now creating the configuration structure looks like the following snippet if all the hardware choices are hard-coded at this location in the code. It could hardly be more concise:

static constexpr USARTConfigT<Periph::Usart2, 
    Port::PA, Pin::P2, Port::PA, Pin::P3> 
    debug_usart_conf;

I’m not normally one to ask for changes to the language, but something that might be nice here, at least as a form of documentation, is named template arguments. Or you could just use named constants as before:

static constexpr USARTConfigT<DBG_PERIPH, 
    DBG_TX_PORT, DBG_TX_PIN, DBG_RX_PORT, DBG_RX_PIN> 
    debug_usart_conf;

Using the trait templates to test pin selections – part 3

Although I am perfectly happy that the inheritance usage above works just fine, maybe not everyone agrees. It is theoretically slicing the derived struct when the configuration is passed to the driver’s constructor (though there’s nothing to lose). It also puts all the arguments before the name of the defined constant structure, which may or may not be to your taste – it’s not exactly uniform initialisation.

There is at least one other way to get what we want, which you may prefer – use a template constexpr function instead. This simple example is possible since C++11:

template <Periph PERIPH, Port TX_PORT, Pin TX_PIN, 
    Port RX_PORT, Pin RX_PIN>
constexpr USARTConfig make_config()
{
    return
    { 
        PERIPH, 
        TX_PORT, TX_PIN, AltFnMap<TX_PORT, TX_PIN, 
            PERIPH, PinFn::TX>::alt_fn, 
        RX_PORT, RX_PIN, AltFnMap<RX_PORT, RX_PIN, 
            PERIPH, PinFn::RX>::alt_fn 
    }; 
};

And creating the driver configuration looks as follows:

static constexpr USARTConfig debug_usart_conf = 
    make_config<Periph::Usart2, Port::PA, 
    Pin::P2, Port::PA, Pin::P3>(); 

The outcome is identical. And now everyone is happy. Actually, now that I’ve typed it, I think I prefer the function. 🙂

Can we just forget all about templates?

Since I used a simple constexpr function above to invoke AltFnMap, I wondered if we could go further and avoid using templates at all. Perhaps. I must confess that I am not as familiar with the innards of constexpr functions as I should be, so this is part of my own journey…

We could in principle do something like this:

constexpr AltFn alt_fn_map(Port PORT, Pin PIN, Periph PERIPH, PinFn FUNC)
{
    using PinAltFn = std::tuple<Port, Pin, Periph, PinFn>;

    auto test = PinAltFn{PORT, PIN, PERIPH, FUNC}; 
    // PA2
    if      (test == PinAltFn{Port::PA, Pin::P2, Periph::Tim2, PinFn::CH3})  return AltFn::AF1; 
    else if (test == PinAltFn{Port::PA, Pin::P2, Periph::Tim5, PinFn::CH3})  return AltFn::AF2; 
    else if (test == PinAltFn{Port::PA, Pin::P2, Periph::Tim9, PinFn::CH1})  return AltFn::AF3; 
    else if (test == PinAltFn{Port::PA, Pin::P2, Periph::Usart2, PinFn::TX}) return AltFn::AF7; 

    // PA3
    else if (test == PinAltFn{Port::PA, Pin::P3, Periph::Tim2, PinFn::CH4})  return AltFn::AF1; 
    else if (test == PinAltFn{Port::PA, Pin::P3, Periph::Tim5, PinFn::CH4})  return AltFn::AF2; 
    else if (test == PinAltFn{Port::PA, Pin::P3, Periph::Tim9, PinFn::CH2})  return AltFn::AF3; 
    else if (test == PinAltFn{Port::PA, Pin::P3, Periph::Usart2, PinFn::RX}) return AltFn::AF7; 

    else static_assert("This alternate function combination does not exist."); // NB: does not do what we hope - see below

    return AltFn::AF0;
}

constexpr USARTConfig make_config(Periph PERIPH, Port TX_PORT, Pin TX_PIN, 
    Port RX_PORT, Pin RX_PIN)
{
    return
    { 
        PERIPH, 
        TX_PORT, TX_PIN, alt_fn_map(TX_PORT, TX_PIN, 
            PERIPH, PinFn::TX), 
        RX_PORT, RX_PIN, alt_fn_map(RX_PORT, RX_PIN, 
            PERIPH, PinFn::RX) 
    }; 
};

static constexpr USARTConfig debug_usart_conf = 
    make_config(Periph::Usart2, Port::PA, 
    Pin::P2, Port::PA, Pin::P2); 

It looks pretty neat and is arguably easier to understand. And perhaps the code is a little shorter than the template version, not that we should obsess over this.

I’ve used a tuple to make a sort of key out of the port, pin, peripheral and alternate function. I wondered if a std::map could be used instead of a chain of conditions, but it seems that this is not possible. There is nothing inherent in maps that makes this an impossible dream, so I guess the committee didn’t think about it. Perhaps in a future standard… For now this code is OK (it’s only for compile time), or you could write a constexpr map, or there are sure to be libraries which already have one.
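For what it’s worth, a homemade “constexpr map” is not much work: a constexpr array of key/value pairs plus a linear search, which the compiler evaluates entirely at compile time. A sketch of my own, with simplified integer keys rather than the tuple above:

```cpp
#include <cassert>

struct Entry { int key; int value; };

// The "map" is just a constexpr array of entries...
constexpr Entry g_map[] = { {1, 10}, {2, 20}, {3, 30} };

// ...and lookup is a constexpr linear search (loops inside
// constexpr functions require C++14).
constexpr int lookup(int key, int fallback = -1)
{
    for (const Entry& e : g_map)
        if (e.key == key)
            return e.value;
    return fallback;
}

static_assert(lookup(2) == 20, "found at compile time");
static_assert(lookup(9) == -1, "fallback at compile time");
```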

There is also the potential advantage of being able to use the function at runtime. This doesn’t seem useful for embedded, but never say never. That advantage is also a potential pitfall since you might not be evaluating the function at compile time as you intended – the compiler won’t say a thing (it’s not an error). I intensely dislike such hidden “gems”. I gather consteval will fix this in C++20. Why didn’t they just do that in C++11? [Honestly, committee guys, I couldn’t care less about coroutines or ranges. How about C++23 focuses entirely on fixing little warts?].

But it doesn’t work!

From what I can tell (I’m using Compiler Explorer), the function works as expected when the arguments are valid, and produces identical output to the templates. But there is no compilation error when the arguments are invalid. It somehow completely ignores the static assertion. This isn’t what I wanted. I understand (I think): alt_fn_map() is a plain old function that just happens to be executable at compile time if conditions are right. There are no template parameters to statically check. Worse, the condition passed to static_assert above is just a string literal, which converts to a non-null pointer and is therefore always true; and even static_assert(false, ...) would be no use, since it would fire unconditionally rather than only for invalid arguments.

I felt sure that there must be some super-clever trick to make this do what I want. Super-clever tricks are generally anathema to me: writing solid production code is not a parlour game of competitive arcana. However, I did find one idea so simple that it would be churlish to complain. It looks like this:

// Just a wrapper around whatever the implementation defines assert() to be.
inline void assert_helper(bool condition) 
{
    assert(condition);
}

// When false, the ternary operator uses the comma operator to evaluate to 
// false, but only after calling assert_helper(). Note that assert_helper 
// is not constexpr - compilation error! If evaluated at runtime instead, 
// assert() behaves as normal. 
inline constexpr bool constexpr_assert(bool condition) 
{
    return condition ? true : (assert_helper(condition), false);
}

There are other more complicated offerings out there, involving lambdas and perfect forwarding, but I haven’t really understood why they might be better. We can use constexpr_assert() in place of static_assert(), and all is well.
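Plugging constexpr_assert() into a lookup function looks something like this (a trimmed-down sketch of my own with a hypothetical two-entry table; the helpers are repeated so the snippet stands alone):

```cpp
#include <cassert>

// Deliberately NOT constexpr: calling it during constant evaluation
// is an error, which is exactly the behaviour we want.
inline void assert_helper(bool condition) { assert(condition); }

inline constexpr bool constexpr_assert(bool condition)
{
    return condition ? true : (assert_helper(condition), false);
}

// Hypothetical mini lookup: pins 2 and 3 map to alternate function 7.
constexpr int alt_fn_demo(int pin)
{
    return pin == 2 ? 7
         : pin == 3 ? 7
         : (constexpr_assert(false), 0);
}

// Compiles: the failure branch is never evaluated.
static_assert(alt_fn_demo(2) == 7, "valid pin");

// Would NOT compile if uncommented - constant evaluation reaches the
// call to the non-constexpr assert_helper():
//     static_assert(alt_fn_demo(4) == 0, "invalid pin");
```

At runtime, the same failure path simply fires assert() as usual.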

So it does work. And that’s great. But I’m still not sure I prefer it. I’m not sufficiently confident with the dark corners and imponderables yet to be completely comfortable with this approach. What I can say for certain is that templates have been around for a very long time, and will certainly get the job done for about the same amount of effort. And if you are using an earlier standard than C++14, templates are the only game in town.

Creating the lookup tables programmatically

Although this example of using traits to identify errors at compile time is simple, the truth is that the lookup table for the entire device is large. And slightly different devices have slightly different lookup tables. Another device may or may not have UART7 among its collection of peripherals, for example, with commensurate changes throughout the datasheet. Creating and maintaining a full table of template specialisations across the whole STM32F4 family, and then tailoring the table to match the various sub-families which are slightly different from each other, … This looks to be quite a big task. It would definitely be worth having, but that’s quite a lot of typing to double check and triple check against the datasheets.

In Datasheets in databases, I waxed lyrical about the potential benefits of a vendor-provided SQL database which contains literally everything that is useful to know about a given family of microprocessors – everything that can be found in the reference manual and datasheets. This article demonstrates a perfect example of my goal: with such a database, we could very easily write a little script to generate all the template specialisations we need for our particular device. We could even do it as a build step in cases where we are targeting multiple devices with the same code base.

I know of at least one case where someone spent a lot of effort parsing the Reference Manual itself (a PDF) with a script in order to extract some useful information. I don’t know how successful that effort was, but it is not a task I fancy much. Easier to generate the PDF from a more machine readable format (hint).

Conclusion

I realise that this article mostly just repeats the message from Traits for wake-up pins, but I thought an example with a potentially very large lookup table might feel more realistic. In any case, I don’t mind repeating the fact that we can use really simple ideas like this to convert potential run time faults into compile time errors. The C++ compiler is already pretty fussy about static type checking before we do anything at all, but we can extend it with very little effort to enforce arbitrary rules of our own and make it immediately absolutely crystal clear when we have made a silly mistake. That’s worth repeating, no?

Datasheets in databases

I focus on STM32 microprocessors in the discussion below, but the ideas could apply to any range of devices.

Background

I have spent a lot of time in recent years trawling through the datasheets, reference manuals and programming manuals for STM32 microprocessors. Mostly for STM32F4s, but also for STM32F0s. These documents are incredibly useful, and contain a wealth of information which is essential for correctly configuring and using the hardware. But there is a problem.

After the umpteenth time of looking up the DMA streams assignment table, I realised that what I really wanted for Christmas was all of that information to be available to me programmatically. I’ve made a couple of different attempts to capture this kind of data in a custom hardware abstraction layer, with all kinds of lookup tables and functions to access them. [Before you say it, one of the issues was that the firmware would waste a load of space on lookup tables and whatnot that are only used during initialisation.] This all worked pretty well, though it was far from complete, and I even got a chance to demonstrate some code to ST (Ans: C++? No thanks).

Writing these experimental libraries gave me a deeper insight: that what I really, really, REALLY wanted for Christmas was all of that information to be available in an easily accessible general purpose machine readable format. This would make it much easier to write productivity tools such as peripheral configurators, pin-out designers, clock tree setter-uppers, code generators, and any number of other things.

Writing such tools is burdened by the onerous task of compiling all of the required data across a range of potentially hundreds of devices with scores of variants. Unsurprisingly, it seems that only ST have risen to the challenge, and they have given us STM32CubeMX (see below).

So I started looking around for some detailed information sources that might satisfy my desire…

Possible information sources

SVD files

As things stand, we have SVD files. From what I can tell, they give enough hierarchical information about peripherals, their registers, their fields, and their permitted values to allow a debugger to display the current values in the registers at run time. To be fair, I haven’t really dug into this: perhaps I’ve misunderstood. This page confirms my understanding: SVD format. That page mentions something called IP-XACT but, from what I can tell, that is not what I’m looking for.

From what I’ve seen, SVD files are of varying quality. They may or may not contain information about enumerated values for fields, for example. I’ve seen SVD files for virtually identical devices which nevertheless contained some odd discrepancies in register naming. There are almost certainly errors.

They also do not contain a ton of other useful information that is found in the datasheets. It appears that SVD files are aimed at a particular application (debugging), and contain just enough information to support that goal.

While it is possible to parse an SVD file to extract useful information (I’ve done this), I’m not convinced of the quality, correctness, consistency or completeness of that information.

STM32CubeMX

ST have invested a great deal of effort in a hardware configuration tool called STM32CubeMX. It looks really good, and the GUI does a pretty nice job of helping you to configure the clock tree, pins, peripherals, DMA, interrupts and whatnot for your particular device. And it will generate the start up code for your project based on your configuration. As a former GUI designer myself, there is much about this tool that I think is excellent. It is certainly the best in this class of application that I have seen: nothing else seems to come close.

I just used it when getting this image. It’s been a while but my recollection was that the pins were easily changed to other valid options (through drop down lists). Not so, it seems. It took me ages to work out how to change from PA2 to PD5, and it wasn’t nearly as helpful as it might be – I had to hunt for the pin. Though it is a great start, I think CubeMX could be much better than it is, but that is a rant for another day.

I was curious about how the GUI knows which pins are acceptable for, say, SPI1 MOSI across hundreds of different devices. It turns out that it has a “database” of sorts. There are hundreds of XML files containing the necessary data. Unfortunately the format is not documented as far as I can tell, and the files appear to contain a lot of meta-data which seems to relate to the specifics of the GUI implementation in some way and/or to ST’s HAL library that CubeMX generates code for. It looks like a maintenance nightmare.

As with SVD files, while it is possible to parse at least some of the less obscure XML files to extract useful information (I did this), I’m not convinced of the quality, correctness, consistency or completeness of that information.

Time to roll my own

I decided to bite the bullet and create my own database. I won’t bore you with the details, but this entailed a lot of trawling through endless tables in the reference manual and the other resources already mentioned. The process was tedious beyond belief, and quite error prone. I captured the information in YAML, processed the YAML with a Python script, and ended up with a SQLite database containing everything I had gleaned from those sources. There was a certain amount of guesswork involved in creating the database schema because I have no inside knowledge about what the hardware designers were thinking.

The database is a self-contained general purpose information store about STM32F4 devices – a large subset of them, at any rate. The schema is published – or would be, if the database ever saw the light of day. The database is relatively easy to write SQL queries for. It could in principle be used by any other tool, written by any third party, in any language, and running on any platform.

What the database contains

The following list represents more or less what I captured in the database. There are sure to be some other useful nuggets I overlooked or didn’t need for the experiment. And then there are electrical characteristics, and, and, …

  • For each particular chip variant covered by the database:
    • RAM size, flash size, number of pins, physical package, maximum clock speed, and so on.
    • Many variants contain exactly the same set of peripherals and whatnot, and I guessed there is a common die or something – same blob of silicon, but different packaging.
    • A map of logical pins (e.g. PA2) to the physical locations on each package (e.g. Pin 27).
  • For each distinct blob of silicon:
    • The set of instances of peripherals it has: TIM1, TIM2, SPI3, USART7, and so on. This includes things like GPIOx, DMAx, NVIC, EXTI, RCC and other blocks which are perhaps not strictly speaking regarded as peripherals.
    • Each peripheral can in a sense be regarded as an instance of a class. There are, for example, several different versions (classes) of the TIM peripherals, with various capabilities – these seem mostly to be subsets of some imaginary super-TIM.
    • A whole family of devices might use one class for SPI, say, which is very similar but not identical to the class used by another family.
    • A given blob of silicon may contain a mixture of instances of different classes for each type of peripheral: TIM1 is nothing like TIM14, for example.
  • For each instance of a peripheral class:
    • The base address for its registers.
    • Its interrupt sources and the map to interrupt vectors in NVIC.
    • The set of bits and registers within RCC which are used to enable the peripheral.
  • For each peripheral class:
    • Its set of registers and their offsets from the instance base address.
    • For each register: its set of fields and their sizes and offsets in bits.
    • For each field: the type of data it contains (typically bool, int, or enum), whether it is read-only, write-only or whatever, and the range of permitted values.
    • For each enum the names and values of the enumerators. In some cases several fields/registers will use the same enum, e.g. GPIO mode bits.
    • For all of these things a description for human readers.
  • The set of all interrupt vectors in NVIC:
    • These are also part of the map to interrupts sources on particular peripheral instances.
  • For the DMA peripherals:
    • The set of streams and channels and how those are mapped to particular functions for particular peripheral instances.
  • For each GPIO pin:
    • The set of all alternate functions for each pin – which functions can be performed for which peripherals. This is quite a large table in each datasheet.
  • Details of the clock tree:
    • All the different clock sources and their relationships through multiplexers, prescalers, enable bits and so on. That is, all the information required to draw the clock tree diagram which can be found in the documentation (and in the CubeMX GUI – it’s very pretty), and more.
    • Which clocks relate to which peripherals, buses, RCC registers.

I am certain that my database schema could stand a lot of improvement. I am also certain that my understanding of the internals of the various devices is flawed, and that the notional relationships I have invented to take advantage of apparent commonality between chips and peripherals are mistaken or misguided. I would love to get this stuff right. That would seem to require some input from ST’s digital hardware designers.

There is no rocket science in any of this. Just many hours of slogging my way through datasheets, parsing SVD and XML files. It really is not technically difficult but I have not found an existing resource like this anywhere. Perhaps I didn’t look hard enough. This seems very odd to me, because the nature of the data is absolutely begging for such treatment to allow users of these devices to exploit them more easily.

Example query

List all the pins which can be used for TX on USART2:

    SELECT DISTINCT port.name, pin.pin_index
    FROM GPIOPort port
    JOIN GPIOPin pin ON pin.fk_gpio_port = port.pk
    JOIN GPIOAltFunc af ON af.fk_gpio_pin = pin.pk
    JOIN GPIOAltFuncType aft ON aft.pk = af.fk_gpio_af_type
    JOIN PackagePin pack ON pack.gpio_name = pin.name
    WHERE port.fk_gpio_class = 1 AND aft.name = 'USART_TX' 
        AND pack.fk_package = 1 AND af.instance in ('USART2')
    ORDER BY port.name, pin.pin_index

It looks a bit complex, but isn’t really all that bad. There are tables for port, pin, alternate functions, types of alternate functions and a map from logical to physical pins (not all logical pins are exposed on smaller packages). These are joined and filtered with the name of the peripheral and the function we want. A couple of the foreign keys are numbers determined by the chip selection, but the bits for USART2 and USART_TX are clear.

I worked it out in SQLite Studio at first, and this one is generated in Python. In any case, end users would not normally be expected to write queries at all. Rather, this is the sort of thing that might be used internally by a configuration tool.

We could just as easily ask the database for all devices which match some specification (so many UARTs, so many SPIs, so much RAM, …), so that we can select the smallest for our project. Or for pin compatible devices. Or for all the different peripheral interrupt sources which share a given vector. Or…

A custom configuration tool

Imagine I am configuring a USART/UART driver in the tool shown below:

  1. The software already knows what particular STM32F4 I am using.
  2. I create a new instance of the driver USARTDriver from a menu of available driver classes. This is a GUI proxy for the actual USARTDriver class in my library.
  3. In the dialog that appears, I choose a UART/USART from a drop down list. The list contains only the peripherals that are available on the chip. It highlights the ones I have already used. I choose USART2.
  4. Now I choose a TX pin from a drop down list in the dialog. The list contains the results of the query above (PA2 and PD5, it turns out). It highlights a potential pin conflict: I have already used PA2 elsewhere in this case. Double assignment is permitted (you’re fine if the firmware serialises resource usage), but it’s good to know.
  5. And on to the RX pin, DMA streams, and other settings.
  6. The relevant DMA and USART interrupts are found automatically, as are the relevant RCC clock bits and other information that requires no user selection.

Of course, you will say that this is precisely what CubeMX does for us already. And you’d be right. Well, sort of: I decided that a driver-level abstraction was better than CubeMX’s peripheral-level abstraction. And I generate C++ code directly against my own library, which is not connected to ST’s HAL. Later on I could update the tool to optionally generate Rust instead or, Heaven forfend, C.

This configuration tool is very far from complete – it’s just a toy really. I wrote it in Python as a demonstrator for the database. The point is that doing so was relatively straightforward.

A hardware abstraction layer

Creating a nice simple C++ HAL for STM32s has been something of a hobby horse of mine for a while. This whole database idea came up as an offshoot of that project.

If you have read Traits for wakeup pins, you may recall that I created a small compile time lookup table to help me not shoot myself in the foot with an invalid pin assignment on a Mighty Gecko. With a SQL database for this family of Silicon Labs devices such as I created for STM32F4s, it would be very simple to write a little script to generate exactly that table.

In fact, it would be simple to generate all kinds of tables to capture all kinds of information and relationships that would help to enforce correct combinations of pins, peripherals, functions, interrupt vectors and a dozen other things at compile time. At compile time! HALs usually contain hundreds or thousands of #defines for constants of various kinds, but little or nothing which explains their relationships or enforces any constraints on their use. Would it be good if the vendor-supplied HAL did those things? Personally, I think it would be awesome.

Generating some or all of the HAL

I would go further. Existing HALs contain a lot of duplication and/or conditional compilation. It’s wasteful and ugly and difficult to follow, and often depends on obscurantist preprocessor magic to select for your target device. But if creating big chunks of the HAL is a matter of running a script, why not do it for your specific project? That is to say, for your specific target device. Suppose the vendor supplied a script to do this. Or you could just write one yourself.

Such generated code would contain precisely the information you need, tailored to your particular device, and nothing more. It would be like having the datasheet in code form (derived from the datasheet in database form). Goodbye conditional compilation Hell.

And while we’re at it, why not also generate the SVD file your debugger needs for your particular device? Avoid filling up your hard disk with thousands of SDK files you will never use. 🙂

Testing for portability – sort of

Now suppose you have a fairly mature project and you decide you want to switch to a cheaper variant of the processor. Run the script to re-generate your HAL, build your project, see what won’t compile anymore. Oops! You used TIM14 and the cheaper device doesn’t even have it. Oops! You used PF7 and the smaller package doesn’t bring that out to a physical pin.

Go one step further: make the HAL generation a build step. Is this fantasy? Could it really be so straightforward? I believe it could. HAL code in C usually contains common functions across a whole family of devices. If your application is written in terms of a common API, but makes use of device-specific lookup tables for compile time configurations, then it should work.

There are, of course, no magic bullets. But I am absolutely certain that we could do a lot better in this area than we do currently.

A plea to microprocessor vendors

The pitch

I believe that ST and other vendors would be doing themselves and their users an enormous favour by creating such databases themselves. One for each family or sub-family of parts, say – whatever makes the most sense with their range(s) of hardware.

It seems obvious to me that the vendors already have at their disposal all of the information I want. And they already fully understand it and all the relationships within it which can be used to exploit commonality. To what extent their data is machine readable, I daren’t guess. However they do it, it is a fact that they produce enormous and generally accurate datasheets and reference manuals, which I suspect are something of a maintenance nightmare. What if all the tables in the datasheets could be created with automation which queries a database like mine? Just leaving that there.

Vendors also produce a great deal of code in the form of hardware abstraction libraries, board support packages, and the low-level ARM CMSIS stuff. It’s a sad truth that this code is always written in C, and is often of somewhat poor quality (hardware engineers should stick to what they know). To what extent the code is generated, I also dare not guess. Big chunks of it, with hundreds or thousands of macro constants, might be generated (CMSIS looks likely). There is a huge amount of duplication and/or conditional compilation. I dread to think what a maintenance headache that must be if it is all done by hand. It’s all a horrible mess, either way, and I am sure we could do better.

Finally, at least some vendors produce configuration tools and code generators which theoretically help users to get going on their projects. They don’t make money directly from their support code or their tooling. These represent a kind of loss leader intended to make their parts more accessible and thereby more attractive to developers. It seems to me that the burden of creating all that code and tooling must be quite heavy.

I believe that providing databases like the one I created (only more so) would be a powerful enabler for third parties, private projects, and the open source community. Creating tools is pretty straightforward. Obtaining the megabytes of data needed to make the tools useful is not. Enabling other motivated types to create tools and libraries could reduce the burden/expectation on the vendors, and (I’m getting a bit carried away here) might even help to increase the general quality of libraries and tooling in the embedded domain which, if we’re honest, is not really well served at the moment.

And finally… the plea

  • To silicon vendors in general: Can we please make this happen? It could be amazing.
  • To STMicroelectronics: Someone has to go first, right? I have ideas. Let’s talk.

Traits for wake-up pins

This is a real example from an embedded project I’ve been working on. The goal was to convert a potential run time error into a compile time error. If you haven’t done so already, please take a look at Traits templates 101, which introduces some of the ideas discussed below.

The scenario

The device is a Silicon Labs Mighty Gecko Zigbee doodad which needs to enter the microprocessor’s deep sleep state (a mode called EM4H – H for “hibernate”) for long periods of time in order to conserve energy. The chip is pretty much dead in this state, but can run the RTC. Aside from regular wake-ups driven by the RTC, the device also needs to wake up when the user presses a button.

This is a very simple thing to do in principle. You have to configure a GPIO pin as an input, you have to enable interrupts from that input, and you have to enable the EM4 wake-up capability for that pin. There are a few other bits and pieces, but those are the essentials.

As it turns out, there is only a very limited range of pins on the processor which can actually wake it up from EM4. Any pin can be used to generate interrupts, but only certain specific pins can bring the device out of hibernation. Worse, enabling the EM4 wake-up capability for a pin requires setting a bit in some other register whose bit index is not remotely related to the pin index.

Situations like this are very common in embedded development. I guess there were compromises when designing the silicon, or there was no decent coffee on hand, or something. You need to read the datasheet very carefully to make sure you select an appropriate pin for the button and the correct EM4 enable bit.

Assuming the board design is correct, it would be very easy indeed to configure the wrong pin or something in the firmware, and you wouldn’t know about this until your button didn’t do anything three months from now when the board is finally ready. Or later refactoring, pin re-assignments, or something might cause problems. This might be relatively easy to fix, but we can do a little better. And yes, of course unit testing is a thing. Testing does present challenges when you get down to the metal, so anything else that helps is good.

Step 1: Create a compile time lookup table

We will use a simple custom trait class to create a lookup table which captures the relevant information from the data sheet.

Create the primary template

template <GPIO_Port_TypeDef PORT, 
    uint32_t PIN, bool VALUE = false> 
struct GPIOEM4Index
{
    static_assert(VALUE, "Given port and "
        "pin are not allowed for EM4 wake-up.");
};

This template is parametrised on three non-type values. The first is a member of the enumeration GPIO_Port_TypeDef, which is defined in the vendor support library (EMLIB). The second is the index of a pin. The last is a dummy boolean value. The template, when it is instantiated, uses a static assertion to force a compilation error with a hopefully useful message. The boolean parameter makes the assertion dependent on a template parameter, so the compiler cannot evaluate it until the template is actually instantiated; this prevents the assertion from firing for templates which are never used. I’m not sure if this is generally necessary, but it was with the IAR compiler (EDIT: This also appears to be true of gcc).

Create the lookup table with template specialisations

template <> struct GPIOEM4Index<gpioPortF,  2U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU0; };  
template <> struct GPIOEM4Index<gpioPortF,  7U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU1; };  
template <> struct GPIOEM4Index<gpioPortD, 14U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU4; };  
template <> struct GPIOEM4Index<gpioPortA,  3U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU8; };  
template <> struct GPIOEM4Index<gpioPortB, 13U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU9; };  
template <> struct GPIOEM4Index<gpioPortC, 10U> 
{ static constexpr uint32_t exti_level = GPIO_EXTILEVEL_EM4WU12; }; 

The table is a list of template specialisations for each pin that supports the wake-up feature. The table creates a compile time map from (port, pin) pairs to particular values of a constant which I have called exti_level – this corresponds to the name of the register to which this datum relates. GPIO_EXTILEVEL_EM4WU0 and so on are constants defined in the vendor support library.

This map was obtained from a quick look at the datasheet. As you can see, there are just six pins which can be used to wake the device from EM4H. Wake-up source 0 is tied in the hardware to pin PF2, WU1 to PF7, and so on…

GPIOEM4Index is a simple example of a trait class. I was going to say “type trait”, but it is parametrised on multiple non-type parameters, rather than a single type parameter. It hardly matters in practice, and it works in essentially the same way.

Step 2: Use the lookup table at compile time

In my software, a digital input is an instance of a class called GPIOInput. We don’t care about its internals here; it just wraps a few calls to functions in EMLIB. Each input is configured by passing a structure to its constructor. One of these:

struct GPIOInputConfig
{
    GPIO_Port_TypeDef port;
    uint32_t          pin; // MISRA didn't like uint8_t
    bool              wakeup;
    uint32_t          exti_level;
    // ... other members we don't need here
};

// In class GPIOInput
// GPIOInput(const GPIOInputConfig& conf);

I’ve omitted all the stuff about pull-up/pull-down behaviour, which edges cause interrupts, and so on. You can easily extend the code to handle these later.

The configuration for a particular pin can be defined as follows. The structure button_conf is a compile time constant, and is passed to the GPIOInput constructor at some point in the firmware.

static constexpr GPIOInputConfig button_conf = 
{ 
    BTN_PORT, 
    BTN_PIN,
    true,
    GPIOEM4Index<BTN_PORT, 
        BTN_PIN>::exti_level
};

BTN_PORT and BTN_PIN are #defines generated by the Simplicity Studio configuration tool, but could just as easily be hand-written names, or explicit hard-coded values. The point to note is this: if the (BTN_PORT, BTN_PIN) combination does not appear in the lookup table, the code will simply not compile. And it will tell you why not. I love this. Using this code, it is now more difficult to create an invalid pin configuration for this feature. We have made a potential run time problem into a potential compile time problem.

But the code could stand some improvement. There are at least two obvious problems.

  1. BTN_PORT and BTN_PIN are duplicated. This is a potential source of error, such as editing only one of the duplicates. I know so because I made just such an error while developing this code. In fact, you could forget to use GPIOEM4Index altogether.
  2. GPIOInput can be used for pins which are digital inputs but which do not need to wake the device up from hibernation. For those pins, the GPIOEM4Index lookup will not compile. We could just pass 0 instead of invoking the template, but this is not very satisfactory, and kind of undermines our efforts.

Remove the duplication of names

The solution to this is to create another template:

template <GPIO_Port_TypeDef PORT, 
          uint32_t PIN,
          bool EM4WU>
struct GPIOInputConfigT : GPIOInputConfig
{
    constexpr GPIOInputConfigT() 
    : GPIOInputConfig{PORT, PIN, EM4WU, 
          GPIOEM4Index<PORT, PIN>::exti_level} 
    {
    }
};

And we use it like this:

static constexpr GPIOInputConfigT 
    <BTN_PORT, BTN_PIN, true> button_conf;
  • GPIOInputConfigT inherits from GPIOInputConfig and adds nothing more than a default constructor which initialises all the values in the base class.
  • Note that the constructor is marked constexpr so that it can “run” at compile time.
  • The values we want are all passed as non-type template parameters and forwarded directly to the base struct.
  • The invocation of GPIOEM4Index is internalised, and the duplication of names is avoided by using this template. You also can’t forget to use it now.

Isn’t that neat? The code for the button’s configuration is now both smaller and better.

You may be slightly worried that button_conf is now an instance of a derived struct, so that when we pass it to the GPIOInput constructor, we will slice the object. There is nothing to slice off anyway, so we lose nothing. There are probably other ways to achieve the same result, such as a constexpr function call. But this works just fine.

Inputs that can’t (or won’t) wake up the device

GPIOInputConfigT is nice, but makes the second issue worse: we can no longer pass 0 for the exti_level value for the pins which don’t have or don’t need this feature. The code won’t compile.

You may have noticed the boolean wakeup value in the configuration structure. This is intended to tell the GPIOInput object whether or not to enable the EM4 wake-up feature for its pin. The solution to our problem involves testing this value at compile time. It is passed to the template as EM4WU:

constexpr GPIOInputConfigT() 
: GPIOInputConfig{PORT, PIN, EM4WU} 
{
    if constexpr (EM4WU)
    {
        exti_level = GPIOEM4Index<PORT, PIN>::exti_level; 
    }
}
  • The constructor is re-written to use the “if constexpr” construction.
  • This evaluates the test at compile time and either generates the dependent code or doesn’t, depending on the value.
  • We check the value of EM4WU, a boolean. If it is true, we invoke GPIOEM4Index. If not we do nothing. Problem solved.
  • It might make sense to set a default of 0 or something here or in the base struct declaration, but it doesn’t really matter, since the wakeup flag is false in that case. The aggregate initialisation of GPIOInputConfig should zero the value for us anyway.

Now we can use GPIOInputConfigT to create constexpr configurations for all of our digital inputs whether they need the EM4 wake-up feature or not.

Note that none of this requires us to statically initialise our actual GPIOInput instances. We can create them, or not, as necessary at run time. If you should need to select the port and pin dynamically (unlikely), then this technique is not going to help so much. But you’d assert on a lookup table at run time, right?

I can think of other improvements, such as changing the wake-up flag from a bool to an enumeration, and similarly with the pin index. And some better names might be in order.

Conclusion

That might have seemed like a lot of typing (it took under twenty minutes to get this working), but the benefit is that we have made configuring digital inputs a little bit safer in the face of board revisions which change the pins around, and of many other sources of discrepancy between what we intended and what we did.

The ability to convert run time faults into compile time faults is one of my favourite advantages of using C++ for embedded software development. Just try doing that with C. All the magic happens in the compiler, and it has literally cost nothing in terms of image size or performance. Since it is all done in terms of compile-time constants, we don’t even have to optimise to make the template stuff evaporate: no actual code is generated.

Much as I rail against the apparent growing obsession with template-meta-program-all-the-things, little tricks like this are just awesome. Does this count as TMP anyway? I don’t know. Maybe. Entry level. All that matters is it isn’t very hard to understand and it puts power in your hands.

If you work on a lot of devices using the same hardware, you could move the code into a little header-only library called “Mighty Gecko FootGun Controls” or whatever, and the effort would pay for itself in no time.

Traits templates 101

If you are not familiar with them, traits in C++ are a simple method of using templates to associate custom information with specific data types (and/or with specific data values) at compile time. If you think for a moment about the built-in integral types: int, short, long, and so on, you already know that these types each have a maximum value, a minimum value, the fact of whether they are signed, and a whole lot of other information. These bits of information are traits.

Traits templates are a big part of template meta-programming, but don’t let that concern you. The basic idea is no more complex than already stated.

Traits can be pretty much anything: constant values, other types, functions, and so on. They are sometimes implemented as recursive “functions” which execute at compile time. The ability to associate all kinds of information with types at compile time allows us to make decisions at compile time and control meta-programs. Such meta-programs can be arbitrarily complex and present something of a challenge, to say the least, to many developers. But traits can also be used for much more mundane tasks.

Note: I must admit to being slightly confused by some of the jargon around this. We have traits, traits templates, type traits, and probably other names which may or may not all mean the same thing. I’m not sure it matters.

For our purpose here, I’ll stick to constants, and develop a simple type trait.

A magic number type trait

Suppose that, for some reason, you would like to associate a single integer value called magic with certain data types. For example, int gets the value 123, float gets 456, and maybe some others. The values are meaningless, but that’s magic numbers for you. 🙂

Step 1: Primary template

First, we define a primary template to deal with the general case. We’ll make the integer value 0 unless otherwise specified:

template <typename T> struct MagicNumber 
{ static constexpr int magic = 0; };

In this example a template called MagicNumber is created. It is parametrised by some type which we have called T. This is just a stand-in for the name of a concrete type. Remember that a template is not really code: it is rather a set of instructions to the compiler to tell it how to generate code later when we want it to. Think of T as a compile time variable. A template is in some ways just a fancy macro.

We tell the compiler to generate code by instantiating the template. We do this by providing the template with a value (that is, a type) for T, such as int, double, bool, MyClass, std::array<int,31> or whatever. In this case, the generated code will be a struct which defines an integer constant called magic with value 0. If T is int, the compiler will generate a struct called MagicNumber<int>. Don’t be fazed by the angle brackets: it’s just a name. If it helps, imagine the new struct being called MagicNumber_int instead.

// Generated data type
struct MagicNumber<int> 
{ static constexpr int magic = 0; };

// Or, equivalently, you might 
// have written this manually
struct MagicNumber_int 
{ static constexpr int magic = 0; };

In reality, name mangling will most likely give it some other name in your debugger: the point is that some code – a custom data type – is generated.

Step 2: Template specialisations

Second, we define one or more template specialisations for particular parameter types – that is, for particular values of T. For special cases, so to speak. This is how we associate different magic numbers with different types:

template <> struct MagicNumber<int> 
{ static constexpr int magic = 123; };

template <> struct MagicNumber<float> 
{ static constexpr int magic = 456; };

A specialisation is just a different version of the template which is used instead of the primary for a specific value of T. For special cases, so to speak. The compiler will use the specialisation in place of the primary template when generating code if the given value of T matches. In this example, the code creates a specialisation for T = int. MagicNumber<int> is exactly the same struct as before, but the constant magic now equals 123. There is a second specialisation for T = float, with a constant value of 456.

Note that specialisation is not the same as instantiation. We still haven’t generated any code. We’ve just made the instructions to the compiler more detailed, more specific.

And that’s it.

Step 3: Using our type trait

So now let’s use this template. We might create an array whose size, for some reason, is the associated magic number:

int main()
{
    std::array<int, MagicNumber<int>::magic> int_array;
    for (int& value : int_array)
    {
        // Do something... 
    }
}

Or we might just print out the magic numbers for a bunch of types:

int main()
{
    // Prints 123
    std::cout << MagicNumber<int>::magic << '\n'; 
    // Prints 456
    std::cout << MagicNumber<float>::magic << '\n'; 
    // Prints 0
    std::cout << MagicNumber<bool>::magic << '\n'; 
    // Prints 0
    std::cout << MagicNumber<double>::magic << '\n'; 
}

Or a whole bunch of other things.

MagicNumber is a very simple example of a type trait. We have told the compiler to associate the number 123 with type int, 456 with type float, and 0 with all other types. That’s neat. If it didn’t exist already, you could implement sizeof using this technique. In fact, the standard library uses type traits a bit like this to return the maximum values (and many other things) of the various built-in types. https://en.cppreference.com/w/cpp/types/numeric_limits.

Creating compilation errors

We can do a little more with this example. We can change the primary template so that it doesn’t define a constant at all.

It is important to understand that template specialisations don’t need to define the same set of traits as the primary template (or each other). Specialisation is not like inheritance, but rather creates a wholly different version of the template – one which is used in special cases. 🙂

One way that we can use this fact is by forcing a compilation error whenever the primary template is instantiated:

// Primary doesn't contain any magic.
template <typename T> struct MagicNumber {  };
int main()
{
    // Prints 123
    std::cout << MagicNumber<int>::magic << '\n'; 
    // Prints 456
    std::cout << MagicNumber<float>::magic << '\n'; 
    // ERROR! Does not compile
    std::cout << MagicNumber<bool>::magic << '\n'; 
    // ERROR! Does not compile
    std::cout << MagicNumber<double>::magic << '\n'; 
}

The code now only compiles for the specialisations. For all other types, the generated code is an empty struct which does not even have a member called magic. That is to say, other types do not have a trait called magic. If we attempt to use MagicNumber for any types other than int or float, then the program will simply not compile. That’s power.

This simple technique can be used to convert potential run time errors into compile time errors. Fixing compile time errors is a lot simpler and quicker than finding and fixing run time errors. Imagine, for example, that you could force a compilation error if you accidentally configured a microprocessor peripheral with an invalid pin. That sounds useful to me. I explore this a little in another article: Traits for wakeup pins.

Non-type template parameters

You can also define templates whose parameters are particular values of types rather than types. These are non-type parameters. Here’s a silly example which gives you the twin primes of a few numbers. Since there are (conjecturally) infinitely many such prime pairs, this is not a great design…

template <int N> struct TwinPrimeOf { };

template <>
struct TwinPrimeOf<17> { static constexpr int twin = 19; };
template <>
struct TwinPrimeOf<19> { static constexpr int twin = 17; };

template <>
struct TwinPrimeOf<41> { static constexpr int twin = 43; };
template <>
struct TwinPrimeOf<43> { static constexpr int twin = 41; };

Such a template is not, I suppose, strictly speaking a type trait. I’m sure they’re called trait classes or some such, but for practical purposes the notion is exactly the same: you use information known at compile time to perform work at compile time.

Conclusion

Type traits, or traits templates, are a simple but powerful mechanism for associating information with types and/or values at compile time. This article has barely scratched the surface of what can be done with them, but we can see already that with very little effort, we can start to do useful work with type traits in our own software.