Disclaimer: I am not a computer scientist, nor a C++ expert. If the terminology I use in this post is not 100% accurate, I apologize.
Introduction
When writing C++ libraries, I tend to shy away from inheritance. Sometimes inheritance is the right tool for the job: if you have a "B is an A" relationship, it is a great way to reuse code. Even in these cases, I prefer to use non-virtual inheritance, so everything is set at compile time. But sometimes you need to put off specifying the type of an object until runtime (for example, if the type is going to depend on some user input). In these cases, static polymorphism won't work.
There are a few options when we need runtime polymorphism. The built-in, "classic" approach is to use virtual functions. We define a common base class with virtual methods that derived classes then implement. You can then use a base class pointer to point at instances of the derived classes, which can be allocated at runtime. The problem is, we have to use pointers. We also have to make sure we use virtual methods, including a virtual destructor, because this is how C++ makes sure the derived class implementations will be called.
A better approach is to use type erasure, where the inheritance and virtual methods are hidden and the user sees a type that can store instances of "any" type that conforms to the required interface. I've written about this in the past, and it is a really nice, generic solution. The problem, though, is that there is a lot of boilerplate code. We have to define an interface class, a wrapper class (either external or internal to our type-erasing class), and a type-erasing class. We also can't use method templates in our interface, because virtual methods must have concrete signatures.
Implementation
TL;DR
In the rest of the post, I will walk through the motivation behind this example. Basically, we use a std::variant to store different types at runtime and wrap
it with a class that provides an object-like interface, so that users do not have to deal with std::visit directly. For cases where you know the types you want to store up front,
it can be used to provide the same interface as the type erasure pattern.
// snippet-compiler.compiler-command: g++ -std=c++17 {file}
#include <iostream>
#include <utility>
#include <variant>
// Two calculators that share a templated .process(...) interface.
struct Calc1 {
  template<typename T>
  double process(const T& t) const { return t.value(); }
};
struct Calc2 {
  template<typename T>
  double process(const T& t) const { return 1 + t.value(); }
};
// Two data types that share a .value() interface.
struct VarA {
  double value() const {
    return 1;
  }
};
struct VarB {
  double value() const {
    return 2;
  }
};
// Stores either calculator and forwards .process(...) to whichever one is held.
struct MultiCalc {
 private:
  std::variant<Calc1, Calc2> storage;
 public:
  MultiCalc() = default;
  template<typename T>
  MultiCalc(T t) : storage(std::move(t)) {}
  template<typename T>
  double process(const T& t) const {
    return std::visit([&t](const auto& calc) { return calc.process(t); }, storage);
  }
};
int main()
{
  VarA a;
  VarB b;
  Calc1 calc1;
  Calc2 calc2;
  MultiCalc acalc;
  acalc = calc1;
  std::cout << acalc.process(a) << std::endl;
  std::cout << acalc.process(b) << std::endl;
  acalc = calc2;
  std::cout << acalc.process(a) << std::endl;
  std::cout << acalc.process(b) << std::endl;
  return 0;
}
Which will give the following output:
1
2
2
3
Explanation
So what if we have some similar types that share a common interface, but the interface contains some function template methods? I write a lot of code that does some sort of calculation with some sort of "data", usually in a physics simulation. So I will often have a group of classes that represent some data, and another group of classes that perform some calculation on the data. Here is a (contrived) example:
#include <iostream>
struct Calc1 {
  template<typename T>
  double process(const T& t) const { return t.value(); }
};
struct Calc2 {
  template<typename T>
  double process(const T& t) const { return 1 + t.value(); }
};
struct VarA {
  double value() const {
    return 1;
  }
};
struct VarB {
  double value() const {
    return 2;
  }
};
int main()
{
  VarA a;
  VarB b;
  Calc1 calc1;
  Calc2 calc2;
  std::cout << calc1.process(a) << std::endl;
  std::cout << calc1.process(b) << std::endl;
  std::cout << calc2.process(a) << std::endl;
  std::cout << calc2.process(b) << std::endl;
  return 0;
}
Which will give the following output.
1
2
2
3
Here we have two types of data, VarA and VarB, which both provide a .value() method. Then we have two different calculators, Calc1 and Calc2, which perform some calculation when given
some data. The calculation is implemented as a function template method, so it will work with any data that provides a .value() method. Pretty standard stuff.
Now say we wanted to select the calculator to use at runtime. Something like this:
MultiCalc calc;
if (user_input == 1) {
  calc = Calc1();
}
else if (user_input == 2) {
  calc = Calc2();
}
VarA a;
calc.process(a);
How can we achieve this? Well, we could certainly write a type-erasing container that would store objects of any type that had a .process(...) method, but the signature of the method would have to be concrete: we would have
to specify an argument type. Of course, we could create a type-erasing container for the variable types as well and use that as the argument type of our concrete method, so that it would work with any type that had a .value() method.
But this would only work if all of our data types conformed to the same interface. That is certainly the case here, but sometimes we might have a calculator that provides specializations of the .process(...) method template
to deal with certain data types, or, in my case, uses SFINAE to check if the data type provides some method and falls back to a default if not.
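The solution is to use std::variant. std::variant was added to the standard library in C++17, and it's a way to store objects of different types in the same container. The stored object is accessed with std::visit(...), which calls a function object that we supply with whichever alternative is currently held. The catch is that all of the possible types we might want to store need
to be known at compile time.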
The problem here is that we have to deal with std::visit(...), which isn't that bad, but it is not a pattern that is familiar to a lot of programmers. It would be nice if we could give an object-style interface to this. So, let's
wrap calls to std::visit in a class.
// snippet-compiler.compiler-command: g++ -std=c++17 {file}
#include <iostream>
#include <utility>
#include <variant>
// Two calculators that share a templated .process(...) interface.
struct Calc1 {
  template<typename T>
  double process(const T& t) const { return t.value(); }
};
struct Calc2 {
  template<typename T>
  double process(const T& t) const { return 1 + t.value(); }
};
// Two data types that share a .value() interface.
struct VarA {
  double value() const {
    return 1;
  }
};
struct VarB {
  double value() const {
    return 2;
  }
};
// Stores either calculator and forwards .process(...) to whichever one is held.
struct MultiCalc {
 private:
  std::variant<Calc1, Calc2> storage;
 public:
  MultiCalc() = default;
  template<typename T>
  MultiCalc(T t) : storage(std::move(t)) {}
  template<typename T>
  double process(const T& t) const {
    return std::visit([&t](const auto& calc) { return calc.process(t); }, storage);
  }
};
int main()
{
  VarA a;
  VarB b;
  Calc1 calc1;
  Calc2 calc2;
  MultiCalc acalc;
  acalc = calc1;
  std::cout << acalc.process(a) << std::endl;
  std::cout << acalc.process(b) << std::endl;
  acalc = calc2;
  std::cout << acalc.process(a) << std::endl;
  std::cout << acalc.process(b) << std::endl;
  return 0;
}
Which will give the following output:
1
2
2
3
Now we have a type, MultiCalc, that can store objects of type Calc1 and Calc2, and can process any data type that provides a .value() method. If we compare this to the type erasure example, it is a lot less code.
What's really nice is that we don't have to worry about all of the details around construction and assignment to make this efficient, like we do with type erasure; std::variant takes care of all that.
We want to be able to construct and assign to MultiCalc from Calc1 and Calc2 objects, and if the objects are
temporaries, we should move when possible. We will also want to be able to construct and assign from other MultiCalc objects, and again, if the object is a temporary, we want to move. Let's instrument the previous
example to see when the underlying calculators are copied and moved:
#include <iostream>
#include <utility>
#include <variant>
// The copy/move operations print a message so we can see when they run.
struct Calc1 {
  Calc1() = default;
  Calc1(const Calc1& c) { std::cout << " Calc1::copy" << std::endl; }
  Calc1(Calc1&& c) { std::cout << " Calc1::move" << std::endl; }
  Calc1& operator=(const Calc1& c) { std::cout << " Calc1::=copy" << std::endl; return *this; }
  Calc1& operator=(Calc1&& c) { std::cout << " Calc1::=move" << std::endl; return *this; }
  template<typename T>
  double process(const T& t) const { return t.value(); }
};
struct Calc2 {
  Calc2() = default;
  Calc2(const Calc2& c) { std::cout << " Calc2::copy" << std::endl; }
  Calc2(Calc2&& c) { std::cout << " Calc2::move" << std::endl; }
  Calc2& operator=(const Calc2& c) { std::cout << " Calc2::=copy" << std::endl; return *this; }
  Calc2& operator=(Calc2&& c) { std::cout << " Calc2::=move" << std::endl; return *this; }
  template<typename T>
  double process(const T& t) const { return 1 + t.value(); }
};
struct MultiCalc {
 private:
  std::variant<Calc1, Calc2> storage;
 public:
  MultiCalc() = default;
  // Take by value, then move into storage.
  template<typename T>
  MultiCalc(T t) : storage(std::move(t)) {}
  template<typename T>
  double process(const T& t) const {
    return std::visit([&t](const auto& calc) { return calc.process(t); }, storage);
  }
};
int main()
{
  Calc1 calc1;
  Calc2 calc2;
  std::cout << "Construct from lvalue" << std::endl;
  MultiCalc acalc1 = calc1;
  std::cout << "Assign from lvalue" << std::endl;
  acalc1 = calc2;
  std::cout << "Construct from rvalue" << std::endl;
  MultiCalc acalc2 = Calc1();
  std::cout << "Assign from rvalue" << std::endl;
  acalc2 = Calc2();
  std::cout << "Copy construct from lvalue" << std::endl;
  MultiCalc acalc3( acalc1);
  std::cout << "Copy assign from lvalue" << std::endl;
  acalc3 = acalc2;
  std::cout << "Copy construct from rvalue" << std::endl;
  MultiCalc acalc4 = MultiCalc(Calc1());
  std::cout << "Copy assign from rvalue" << std::endl;
  acalc4 = MultiCalc(Calc2());
  return 0;
}
Compile and run, and we get this output:
Construct from lvalue
Calc1::copy
Calc1::move
Assign from lvalue
Calc2::copy
Calc2::move
Calc2::move
Construct from rvalue
Calc1::move
Assign from rvalue
Calc2::move
Calc2::move
Copy construct from lvalue
Calc2::copy
Copy assign from lvalue
Calc2::=copy
Copy construct from rvalue
Calc1::move
Copy assign from rvalue
Calc2::move
Calc2::move
So here we can see that when we construct or assign from an lvalue, we get one copy of the stored type, but when we construct or assign from a temporary, we get only moves. Note that we are using Sean Parent's guideline here: our constructor takes its argument by value and moves it into storage. That causes an extra move in some cases, which Sean points out may happen, but that's not terrible. We can get rid of the extra moves by implementing copy and move constructors and assignment operators, together with a perfect-forwarding constructor and assignment operator:
#include <iostream>
#include <utility>
#include <variant>
// The copy/move operations print a message so we can see when they run.
struct Calc1 {
  Calc1() = default;
  Calc1(const Calc1& c) { std::cout << " Calc1::copy" << std::endl; }
  Calc1(Calc1&& c) { std::cout << " Calc1::move" << std::endl; }
  Calc1& operator=(const Calc1& c) { std::cout << " Calc1::=copy" << std::endl; return *this; }
  Calc1& operator=(Calc1&& c) { std::cout << " Calc1::=move" << std::endl; return *this; }
  template<typename T>
  double process(const T& t) const { return t.value(); }
};
struct Calc2 {
  Calc2() = default;
  Calc2(const Calc2& c) { std::cout << " Calc2::copy" << std::endl; }
  Calc2(Calc2&& c) { std::cout << " Calc2::move" << std::endl; }
  Calc2& operator=(const Calc2& c) { std::cout << " Calc2::=copy" << std::endl; return *this; }
  Calc2& operator=(Calc2&& c) { std::cout << " Calc2::=move" << std::endl; return *this; }
  template<typename T>
  double process(const T& t) const { return 1 + t.value(); }
};
struct MultiCalc {
 private:
  std::variant<Calc1, Calc2> storage;
 public:
  MultiCalc() = default;
  // The non-const overloads keep non-const MultiCalc lvalues from
  // being captured by the forwarding templates below.
  MultiCalc(const MultiCalc&) = default;
  MultiCalc(MultiCalc&) = default;
  MultiCalc(MultiCalc&&) noexcept = default;
  MultiCalc& operator=(MultiCalc&) = default;
  MultiCalc& operator=(const MultiCalc&) = default;
  MultiCalc& operator=(MultiCalc&& other) noexcept { this->storage = std::move(other.storage); return *this; }
  // Forward calculators straight into the variant, with no intermediate object.
  template<typename T>
  MultiCalc(T&& t) : storage(std::forward<T>(t)) {}
  template<typename T>
  MultiCalc& operator=(T&& t) { storage = std::forward<T>(t); return *this; }
  template<typename T>
  double process(const T& t) const {
    return std::visit([&t](const auto& calc) { return calc.process(t); }, storage);
  }
};
int main()
{
  Calc1 calc1;
  Calc2 calc2;
  std::cout << "Construct from lvalue" << std::endl;
  MultiCalc acalc1 = calc1;
  std::cout << "Assign from lvalue" << std::endl;
  acalc1 = calc2;
  std::cout << "Construct from rvalue" << std::endl;
  MultiCalc acalc2 = Calc1();
  std::cout << "Assign from rvalue" << std::endl;
  acalc2 = Calc2();
  std::cout << "Copy construct from lvalue" << std::endl;
  MultiCalc acalc3( acalc1);
  std::cout << "Copy assign from lvalue" << std::endl;
  acalc3 = acalc2;
  std::cout << "Copy construct from rvalue" << std::endl;
  MultiCalc acalc4 = MultiCalc(Calc1());
  std::cout << "Copy assign from rvalue" << std::endl;
  acalc4 = MultiCalc(Calc2());
  return 0;
}
Which outputs the following:
Construct from lvalue
Calc1::copy
Assign from lvalue
Calc2::copy
Construct from rvalue
Calc1::move
Assign from rvalue
Calc2::move
Copy construct from lvalue
Calc2::copy
Copy assign from lvalue
Calc2::=copy
Copy construct from rvalue
Calc1::move
Copy assign from rvalue
Calc2::move
Calc2::move
So we got rid of several moves, but we had to write some boilerplate. For me, writing less code means less opportunity for bugs, so I'm happy with the extra moves.
So with this pattern we get a type-erasure-like interface with a lot less code.
The trade-offs with using this static-runtime object over a type-erasing object are the same as the trade-offs with using std::variant over a type-erasing container like Boost.Any.
We have to know all of the types we plan to store ahead of time, so we can't store arbitrary third-party objects that satisfy our interface requirements. However, we could make this a little
easier to work around by making MultiCalc a class template that takes the stored class types as a parameter pack.
You could argue that we are just providing some syntactic sugar on top of std::variant and std::visit, but our MultiCalc is a type that conforms to the calculator interface, which means we could also pass it to a function template that
expects a calculator, even one that is constrained by a concept.
One benefit is that we get to use method templates, since there aren't any virtual functions. However, that also means the methods won't get instantiated unless they are used. So we couldn't use this to build a plugin that was going to go into some sort of shared library unless we explicitly instantiated the template instances we wanted.
This pattern solves an issue that I have run into several times in the past. I have a library that contains several unrelated types that provide the same interface, and I would like to
provide users a type that supports switching between them at runtime. For example, [libInterpolate](https://github.com/CD3/libInterpolate) is a C++ library that implements several different methods for 1- and 2-dimensional interpolation.
To support switching between methods at runtime, I wrote an AnyInterpolator<...> class that can hold instances of different interpolators:
#include <libInterpolate/Interpolate.hpp>
#include <libInterpolate/AnyInterpolator.hpp>
...
_1D::AnyInterpolator<double> interp = _1D::CubicSplineInterpolator<double>();
interp.setData( x.size(), x.data(), y.data() );
...
// do some interpolation
...
// now change the interpolator
// you will need to reload the data though
interp = _1D::LinearInterpolator<double>();
interp.setData( x.size(), x.data(), y.data() );
...
// do some more interpolation
...
I used type erasure for this (in fact, I used the Boost library), but I could have done it with this static-runtime pattern. I plan to do some benchmarking, and if I find that this method is faster than a type-erasing container, I will
rewrite AnyInterpolator<...> to use it.