Simple Type Erasure in C++ with Legacy Code Integration

Introduction

Disclaimer: I am not a computer scientist, nor a C++ expert. If the terminology I use in this post is not 100% accurate, I apologize.

Sometimes in C++, you need to store an object whose type is not known until runtime. For example, it may depend on user input or a configuration file setting. The "classic" method for doing this in C++ is to use inheritance with virtual methods. Another method that seems to be becoming more popular is type erasure. Both methods solve the same problem: we would like to store objects of different types and call methods on them at runtime.

A common example given is when we need to store multiple types in a container:

class MyType1{...};
class MyType2{...};

std::vector<???> objects;

If we want to store both MyType1 and MyType2 objects in a std::vector, we have to do some extra work. The "classic" method, which is supported by the language, is inheritance. We create a common base type and have our types inherit from it.

class MyTypeInterface{...};
class MyType1 : public MyTypeInterface {...};
class MyType2 : public MyTypeInterface {...};

std::vector<MyTypeInterface*> objects;
objects.push_back( new MyType1() );
objects.push_back( new MyType2() );
...
objects[0]->method1();
objects[1]->method1();

Here, MyTypeInterface would define the set of methods that derived types must implement to be used through a MyTypeInterface*, and each concrete type would implement its own version of the interface.

There are two issues with this:

  1. Any type that we want to store in a vector together must inherit from MyTypeInterface, which means we have to design for this behavior up front.
  2. We have to deal with pointers, memory allocations, memory deallocations, and reference semantics.

Type erasure addresses both of these issues. With type erasure, we can create a "magic" type named AnyMyType that is capable of storing an object of any of our types and acts like a value (value semantics).

class MyType1 {...};
class MyType2 {...};

std::vector<AnyMyType> objects;
objects.push_back( MyType1() );
objects.push_back( MyType2() );
...
objects[0].method1();
objects[1].method1();

With type erasure, we can store unrelated types in a single container, and we don't have to manage memory.

Another use case for type erasure is to implement run-time dependency injection. For example, my interpolation library provides an AnyInterpolator type that can be used to store different types of interpolators, which then allows you to select the interpolation method that is used at runtime.

class Calculator
{
        protected:
        _1D::AnyInterpolator<double> m_interpolator;

        public:
        Calculator( _1D::AnyInterpolator<double> a_interpolator ):m_interpolator(a_interpolator) {}
};

auto calc1 = Calculator( _1D::LinearInterpolator<double>() );
auto calc2 = Calculator( _1D::CubicSplineInterpolator<double>() );

You could even provide a method to change the interpolation method if you wanted. There are many other use cases that make type erasure attractive.
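As a rough illustration of swapping the injected dependency at runtime, here is a minimal sketch. It does not use the interpolation library. Instead it assumes std::function<double(double)> as a stand-in type-erased member, and the Doubler and Squarer callables, the setInterpolator() and evaluate() methods, and demo_calculator() are all hypothetical names made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-ins for interpolators: each is just a callable taking x.
struct Doubler { double operator()(double x) const { return 2.0 * x; } };
struct Squarer { double operator()(double x) const { return x * x; } };

class Calculator
{
 protected:
  // std::function type-erases any callable with this signature.
  std::function<double(double)> m_interpolator;

 public:
  explicit Calculator(std::function<double(double)> a_interpolator)
      : m_interpolator(std::move(a_interpolator)) {}

  // Replace the injected dependency after construction.
  void setInterpolator(std::function<double(double)> a_interpolator)
  {
    m_interpolator = std::move(a_interpolator);
  }

  double evaluate(double x) const { return m_interpolator(x); }
};

// Example use: pick the dependency at construction, then replace it later.
inline double demo_calculator()
{
  Calculator calc{Doubler()};
  double before = calc.evaluate(3.0);  // evaluated with Doubler
  calc.setInterpolator(Squarer());
  double after = calc.evaluate(3.0);   // evaluated with Squarer
  assert(before != after);
  return before + after;
}
```

The Calculator class itself never names Doubler or Squarer, which is the same property the AnyInterpolator member gives the class above.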

In this post, I am not going to go through the details of how type erasure works. There are plenty of good talks and posts on that subject:

  1. Sean Parent - Better Code: Runtime Polymorphism (this is the first time I saw type erasure)
  2. Sean Parent - Inheritance is the Base Class of Evil (this seems to be his most famous talk on the subject)
  3. Zach Laine - Pragmatic Type Erasure: Solving OOP Problems w/ Elegant Design Pattern (I really like Zach's talks)
  4. Arthur O'Dwyer - Back to Basics: Type Erasure
  5. Arthur O'Dwyer - What is Type Erasure
  6. Andrzej Krzemieński - Type erasure - Part I

Instead, I will outline a pattern that can be used to provide traditional-style pointer-based runtime polymorphism and modern value-style runtime polymorphism that can be integrated with legacy code that uses inheritance.

The main issue with type erasure, as we will see, is boilerplate code. Creating a new type-erasing type requires a lot of repeated code. There are several libraries that try to make type erasure easier to use:

  1. Boost.TypeErasure
  2. boost-ext/te
  3. Dyno
  4. Poly (Facebook)

Note: One thing I would like to point out is that when people say that type erasure gives us runtime polymorphism without inheritance, what they mean is that the user of the type-erasing class does not have to use inheritance in their own classes. Under the hood, type erasure typically uses inheritance and virtual methods, and it usually allocates memory. This is just hidden from the user, so that there is less risk of making a mistake.

Implementation

TL;DR

We are going to walk through the implementation of a simple type-erasing container. Here is the full example for those who like to just look at the code.

#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>
/**
* A class to define the methods that will be callable at runtime.
*/
class Interface
{
  public:
    virtual int getOpt1() const = 0;
    virtual void setOpt1(int) = 0;
    virtual ~Interface() = default;
    virtual std::unique_ptr<Interface> copy() const = 0;
};

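/**
* A tag type used to rank overloads: priority<1> converts to priority<0>,
* so an overload taking priority<1> is preferred when both are viable.
*/
template<size_t N>
struct priority : priority<N-1>{};
template<>
struct priority<0>{};

/**
* A class for bringing new types into an inheritance tree non-intrusively.
*/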
template<typename T>
class Wrapper : public Interface, T
{
 public:
  using T::T;
  using T::operator=;
  Wrapper(const T& t) : T(t) {}
  Wrapper(const T&& t) : T(t) {}

  virtual int  getOpt1() const final{ return _getOpt1(priority<1>{}); }
  virtual void setOpt1(int a) final { _setOpt1(a,priority<1>{}); }
  virtual std::unique_ptr<Interface> copy() const {return std::unique_ptr<Interface>(new Wrapper<T>(*static_cast<const T*>(this)));}

 protected:
  template<typename TT = T>
  auto _getOpt1(priority<0>) const -> decltype(static_cast<const TT*>(this)->getOpt1()) { return static_cast<const TT*>(this)->getOpt1(); }
  template<typename TT = T>
  auto _getOpt1(priority<1>) const -> decltype(map_getOpt1(*static_cast<const TT*>(this))) { return map_getOpt1(*static_cast<const TT*>(this)); }

  template<typename TT = T>
  auto _setOpt1(int a,priority<0>) -> decltype(static_cast<TT*>(this)->setOpt1(a)) { static_cast<TT*>(this)->setOpt1(a); }
  template<typename TT = T>
  auto _setOpt1(int a,priority<1>) -> decltype(map_setOpt1(*static_cast<TT*>(this), a)) {map_setOpt1(*static_cast<TT*>(this), a);}
};

/**
* A type-erasing container that can store and use any type providing the methods listed in Interface
*/
class Any
{
 private:
  std::unique_ptr<Interface> m_storage;

 public:
  int  getOpt1() const { return m_storage->getOpt1(); }
  void setOpt1(int a) { m_storage->setOpt1(a); }

  // type erasure magic!
  template<typename T>
  Any(const Wrapper<T>& t) : m_storage(new Wrapper<T>(t)) { }

  template<typename T>
  Any(const Wrapper<T>&& t) : m_storage(new Wrapper<T>(t)) { }

  template<typename T>
  Any(const T& t) : m_storage(new Wrapper<T>(t)) { }

  template<typename T>
  Any(const T&& t) : m_storage(new Wrapper<T>(t)) { }

  Any(const Any& a): m_storage( a.m_storage->copy() ) { }

  Any& operator=(const Any& a)
  {
    m_storage = a.m_storage->copy();
    return *this;
  }

};


class A
{
  public:
  void setOpt1(int a){}
  int getOpt1() const {return 1;}
};
class B
{
  public:
  void setOpt1(int a){}
  int getOpt1() const {return 2;}
};



int main()
{
  std::vector<Any> objects;
  objects.push_back( A() );
  objects.push_back( B() );

  Any any = objects[0];
  for(auto &item : objects )
    std::cout << item.getOpt1() << "\n";
  std::cout << any.getOpt1() << "\n";

}

/**
* output:
* 1
* 2
* 1
*/

Creating an Interface

The first thing we have to do is define the interface that the types we will store are going to expose (or what they will "afford" if you are Arthur). This should be a pure-abstract class with no data members. Let's consider a simple case, where we have a collection of classes that support setting and getting a parameter called Opt1. We want to be able to store objects of these types in a single container and be able to call getOpt1() and setOpt1(int) on them.

class Interface
{
  public:
    virtual int getOpt1() const = 0;
    virtual void setOpt1(int) = 0;
    virtual ~Interface() = default;
};

Note that we included a virtual destructor in the interface. This is required to allow objects of derived types to be deleted correctly. Without a virtual destructor, deleting a derived object through an Interface* is undefined behavior, and in practice the derived part of the object will not be destroyed correctly.
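To make the role of the virtual destructor concrete, here is a small sketch. The Tracked class and the derived_destructor_runs() helper are hypothetical and exist only for this illustration. Tracked records when its destructor runs, and the object is destroyed through a std::unique_ptr<Interface>, which deletes through an Interface*.

```cpp
#include <cassert>
#include <memory>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual void setOpt1(int) = 0;
  virtual ~Interface() = default;
};

// Hypothetical derived type that records when its destructor runs.
class Tracked : public Interface
{
  bool* m_destroyed;
  int m_opt1 = 0;

 public:
  explicit Tracked(bool* destroyed) : m_destroyed(destroyed) {}
  ~Tracked() override { *m_destroyed = true; }
  int getOpt1() const override { return m_opt1; }
  void setOpt1(int a) override { m_opt1 = a; }
};

// Returns true if ~Tracked() ran when the object was deleted via Interface*.
inline bool derived_destructor_runs()
{
  bool destroyed = false;
  {
    std::unique_ptr<Interface> p(new Tracked(&destroyed));
    p->setOpt1(5);
    assert(p->getOpt1() == 5);
  } // p goes out of scope here and deletes the object through Interface*
  return destroyed;
}
```

Because ~Interface() is virtual, the delete dispatches to ~Tracked() first. Removing the virtual keyword would make the same delete undefined behavior.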

Creating a Wrapper

With the interface class defined, we can now store objects of different types in a vector and call getOpt1() and setOpt1(int) on them. But the objects must derive from Interface. In the "traditional" pattern, we would write all our classes and derive from Interface. But here, we don't want to require that a class inherit from Interface to be used. Perhaps we didn't write the class, and it's not possible to modify its definition. Or, maybe we would like to define multiple interfaces, where a single class could implement more than one of them, and we don't want to inherit from every single interface we want to support.

Instead, we can create a wrapper class template that inherits from Interface and its template parameter:

template<typename T>
class Wrapper : public Interface, T
{...};

This class inherits from Interface, so it can be referenced with an Interface*. But it also inherits from T, so it can be used as a T as well. Note that this won't work with the primitive types (a class cannot derive from int, for example), but here we are interested in calling methods on some object, so that's OK. Arthur O'Dwyer does something similar in his post What is Type Erasure, but the wrapper class holds a member of type T instead of deriving from it.
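For comparison, here is a minimal sketch of the member-holding alternative. This is not Arthur O'Dwyer's actual code, just an illustration of the idea. The names MemberWrapper and Sealed are hypothetical.

```cpp
#include <cassert>
#include <utility>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual void setOpt1(int) = 0;
  virtual ~Interface() = default;
};

// Hypothetical composition-based variant: holds a T instead of deriving from it.
template<typename T>
class MemberWrapper final : public Interface
{
  T m_value;

 public:
  explicit MemberWrapper(T value) : m_value(std::move(value)) {}
  int getOpt1() const override { return m_value.getOpt1(); }
  void setOpt1(int a) override { m_value.setOpt1(a); }
};

// A final class cannot be used as a base class, but it can be held as a member.
class Sealed final
{
  int m_opt1 = 7;

 public:
  int getOpt1() const { return m_opt1; }
  void setOpt1(int a) { m_opt1 = a; }
};

// Example use: call the wrapped object through the Interface.
inline int member_wrapper_demo()
{
  MemberWrapper<Sealed> w{Sealed()};
  Interface& i = w;
  assert(i.getOpt1() == 7);
  i.setOpt1(9);
  return i.getOpt1();
}
```

The trade-off is that a member-holding wrapper cannot pick up T's constructors and assignment operators with using-declarations, so they have to be forwarded by hand. The rest of this post sticks with the inheriting version.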

To implement the wrapper, we just need to forward calls to the interface methods to the type T.

template<typename T>
class Wrapper : public Interface, T
{
  public:
    using T::T;
    using T::operator=;
    virtual int getOpt1() const final { return static_cast<const T*>(this)->getOpt1(); }
    virtual void setOpt1(int a) final { static_cast<T*>(this)->setOpt1(a); }
};

Note the using T::T and using T::operator=. These forward the constructors and assignment operators for T to Wrapper<T> so that we can construct and assign a Wrapper<T> just like we would a T.

class AType
{
  public:
    AType(int){...}
};
...
Wrapper<AType> obj(1); // this works because of `using T::T`
AType a(2);
obj = a; // this works because of `using T::operator=`

Now we can take any type that supports (or affords) the Interface interface, and store it in a std::vector<Interface*>.

std::vector<Interface*> objects;
objects.push_back( new Wrapper<MyType1>() );
objects.push_back( new Wrapper<MyType2>() );

...

objects[0]->setOpt1(1);
objects[1]->setOpt1(2);

This actually solves the first issue we have with using traditional inheritance. We can now store objects of unrelated types in a single container without modifying the classes. So, we can just write our classes to do what they need to do and worry about this runtime polymorphism later.

Creating a Type-erasing Container

We still need to address the second issue: pointers and reference semantics. Here, we have to work with pointers, which is error-prone. In the example above, for instance, we need to remember to delete all the elements of objects. We should probably use a smart pointer instead to delete the elements for us automatically. Either way, if we copy an element from objects, we will get a copy of the pointer, not the value. So modifying the copy will modify the original.
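The aliasing problem can be seen in a few lines. The sketch below assumes std::shared_ptr as the smart pointer (a std::unique_ptr element could not be copied at all), and aliasing_demo() is a hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual void setOpt1(int) = 0;
  virtual ~Interface() = default;
};

template<typename T>
class Wrapper : public Interface, T
{
 public:
  using T::T;
  int getOpt1() const final { return static_cast<const T*>(this)->getOpt1(); }
  void setOpt1(int a) final { static_cast<T*>(this)->setOpt1(a); }
};

class MyType1
{
  int m_opt1 = 1;

 public:
  int getOpt1() const { return m_opt1; }
  void setOpt1(int a) { m_opt1 = a; }
};

// Copies an element out of the vector, modifies the copy,
// and returns what the element still in the vector now reports.
inline int aliasing_demo()
{
  std::vector<std::shared_ptr<Interface>> objects;
  objects.push_back(std::make_shared<Wrapper<MyType1>>());

  std::shared_ptr<Interface> copy = objects[0];  // copies the pointer only
  copy->setOpt1(42);

  assert(copy.get() == objects[0].get());  // both point at the same object
  return objects[0]->getOpt1();
}
```

The smart pointer takes care of the delete, but the "copy" and the element in the vector are still the same object.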

The second goal of type erasure is to provide value semantics: an object that acts like a regular instance, with copies that are independent of the original. We want something that behaves like this:

std::vector<Any> objects;
objects.push_back( Wrapper<MyType1>() );
objects.push_back( Wrapper<MyType2>() );

...

objects[0].setOpt1(1);
objects[1].setOpt1(2);

We can get the interface we want with another small wrapper class. We create a class named Any that stores an object through a std::unique_ptr<Interface> and forwards calls to the stored object.

class Any
{
 private:
  std::unique_ptr<Interface> m_storage;

 public:
  int  getOpt1() const { return m_storage->getOpt1(); }
  void setOpt1(int a) { m_storage->setOpt1(a); }

  // type erasure magic!
  template<typename T>
  Any(const Wrapper<T>& t) : m_storage(new Wrapper<T>(t)) { }

  template<typename T>
  Any(const Wrapper<T>&& t) : m_storage(new Wrapper<T>(t)) { }
};

Here, we store a std::unique_ptr<Interface> that owns an object derived from Interface. The getOpt1() and setOpt1(int) methods are implemented by just forwarding them to the stored object. Note that these methods are non-virtual. This apparently has some advantages, but again, when people say that type erasure gives us runtime polymorphism without virtual functions, they are talking about this interface. We will still be using virtual functions and a vtable under the hood here.

The trick that makes type erasure work is the template constructor. This is the only point where we know the concrete type of the thing that is getting stored. We construct a Wrapper<T> object and copy the parameter t in. Wrapper<T> knows how to forward calls to T, so that the method calls we make through Interface* will be sent to the right place. But no other part of our class knows about the type T. Most importantly, the type of Any does not depend on T, so we can store multiple instances of Any in a single vector, and each one can store a different T. That is the type erasure part.

Final Product

With these three classes defined, Interface, Wrapper, and Any, we can now store (and use) any object that provides the getOpt1() and setOpt1(int) methods.

class A
{
 private:
  int m_Opt1 = 1;

 public:
  int  getOpt1() const { return m_Opt1; }
  void setOpt1(int a) { m_Opt1 = a; }
};

class B
{
 private:
  int m_Opt1 = 2;

 public:
  B() = default;
  B(int a):m_Opt1(a){}
  int  getOpt1() const { return m_Opt1; }
  void setOpt1(int a) { m_Opt1 = a; }
};

int main()
{
  Any any = Wrapper<A>();
  std::cout << any.getOpt1() << std::endl; // output: 1
  any = Wrapper<B>(10);
  std::cout << any.getOpt1() << std::endl; // output: 10

  return 0;
}

Discussion

The two goals that are usually cited as motivating the type erasure pattern are the issues I listed above:

  • We want non-intrusive runtime polymorphism. If we use the classic inheritance pattern, the class inheritance tree will permeate the entire code base. I want to write separate, independent, testable classes.
  • We want value semantics. If we use the classic pattern, we are going to have to new and delete everything we want to dispatch at runtime and copies will be pointer copies. Using a smart pointer will help, but we will still have pointer syntax and shallow copies. I want something that looks like a normal stack object (even though it will allocate on the heap).

As we saw above, the first issue can be addressed with a simple wrapper class that brings outside types into an inheritance tree. If your team prefers to work with raw pointers (I'm sure they have their reasons), then a wrapper class template like Wrapper<T> is all that is needed. You can even use a wrapper class to bring an outside type into an existing inheritance tree. If you already have a top-level base class like Interface, then you can simply create a Wrapper<T> class template that inherits from it instead.

The type erasure "trick" is using a constructor template in a non-template class. The type is known at the point that the object is constructed, but then erased by storing it in a class that does not depend on this type. The implementation I have shown is fairly simple and straightforward. It is very similar to the one Arthur O'Dwyer shows, except that I have derived from T in Wrapper<T> instead of using it as a member. The nice thing about inheriting from T is that we forward the constructors and assignment operators automatically.

Other common implementations put both Interface and Wrapper<T> inside Any as nested classes. See Zach Laine's talk for example (the 17:11 mark shows a slide of his code), or Sean Parent's "Better Code" talk. Both Zach and Sean go into great detail on designing the type-erasing container, and a lot of the discussion is around construction: making sure that you do the right thing when you construct a new Any, move rather than copy when you can, etc. I'm certain their designs are much more robust, and will cover many corner cases. But for simple cases where I just want to switch between multiple types at runtime, this works pretty well.

I like the method shown above because it can be integrated into existing code bases in parts. If you already have an Interface class that all of your types derive from, then you can use Wrapper<T> to wrap third-party classes and use them in your code that expects an Interface*.
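As a sketch of what that integration might look like, consider the example below. Every name in it (Shape, Square, VendorRect, ShapeWrapper, total_area, legacy_demo) is hypothetical. The assumption is a legacy code base built around a Shape base class, plus a third-party class that happens to have a compatible area() method but does not derive from Shape.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical legacy base class that existing code already depends on.
class Shape
{
 public:
  virtual double area() const = 0;
  virtual ~Shape() = default;
};

// Existing class written the traditional way.
class Square : public Shape
{
  double m_side;

 public:
  explicit Square(double side) : m_side(side) {}
  double area() const override { return m_side * m_side; }
};

// Hypothetical third-party class: has area() but knows nothing about Shape.
class VendorRect
{
  double m_w;
  double m_h;

 public:
  VendorRect(double w, double h) : m_w(w), m_h(h) {}
  double area() const { return m_w * m_h; }
};

// Wrapper that brings any T with an area() method into the Shape tree.
template<typename T>
class ShapeWrapper : public Shape, T
{
 public:
  using T::T;
  double area() const final { return static_cast<const T*>(this)->area(); }
};

// Legacy function that only understands pointers to Shape.
inline double total_area(const std::vector<std::unique_ptr<Shape>>& shapes)
{
  double sum = 0.0;
  for (const auto& s : shapes) sum += s->area();
  return sum;
}

// Example use: mix a native Shape with a wrapped third-party type.
inline double legacy_demo()
{
  std::vector<std::unique_ptr<Shape>> shapes;
  shapes.push_back(std::unique_ptr<Shape>(new Square(2.0)));
  shapes.push_back(std::unique_ptr<Shape>(new ShapeWrapper<VendorRect>(3.0, 4.0)));
  assert(shapes.size() == 2);
  return total_area(shapes);
}
```

Neither total_area() nor VendorRect had to change. Only the small wrapper template is new.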

Making Improvements

Constructing/Initializing

There are a few improvements that we can make to the basic implementation above. It would be nice if we could construct or assign to Any directly with the types we want to store, rather than using Wrapper<T>:

Any any = A();
...
A   a;
Any any(a);

Adding two more constructors will allow this:

class Any
{
...
  template<typename T>
  Any(const T& t) : m_storage(new Wrapper<T>(t))
  {
  }
  template<typename T>
  Any(const T&& t) : m_storage(new Wrapper<T>(t))
  {
  }
...
};
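The const T& and const T&& constructors always copy the argument. One possible refinement, sketched here as an assumption rather than as part of the implementation above, is a single constrained forwarding constructor that moves from rvalues and excludes Any itself, so that copying an Any is never hijacked by the template. The Interface, Wrapper, and Any classes below are trimmed to one method for brevity, and Counted is a hypothetical type that counts how it was constructed.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual ~Interface() = default;
};

template<typename T>
class Wrapper : public Interface, T
{
 public:
  Wrapper(const T& t) : T(t) {}
  Wrapper(T&& t) : T(std::move(t)) {}
  int getOpt1() const final { return static_cast<const T*>(this)->getOpt1(); }
};

class Any
{
  std::unique_ptr<Interface> m_storage;

 public:
  // Accept any T except Any itself, so copies and moves of Any
  // are never routed through this template.
  template<typename T,
           typename D = typename std::decay<T>::type,
           typename = typename std::enable_if<!std::is_same<D, Any>::value>::type>
  Any(T&& t) : m_storage(new Wrapper<D>(std::forward<T>(t))) {}

  int getOpt1() const { return m_storage->getOpt1(); }
};

// Hypothetical type that counts copy and move constructions.
struct Counted
{
  static int copies;
  static int moves;
  Counted() = default;
  Counted(const Counted&) { ++copies; }
  Counted(Counted&&) noexcept { ++moves; }
  int getOpt1() const { return 1; }
};
int Counted::copies = 0;
int Counted::moves = 0;
```

With this constructor, Any a(c) for an lvalue c copies it into the wrapper, while Any b{Counted()} moves the temporary instead.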

Providing Missing Method Implementations or Overriding Existing Implementations

The Boost.TypeErasure library calls our Interface a concept, each concrete class is said to model the concept, and our Wrapper<T> class is the equivalent of a concept map. Boost supports user-defined concept maps that allow the user to "non-intrusively adapt a type to model a concept". We can do the same here. If we have a type that does not exactly match the interface (perhaps it uses a different naming convention), we can specialize Wrapper<T> for that type and provide the mapping. Consider a class C that provides get and set for Option1 instead of Opt1:

class C
{
 private:
  int m_Option1 = 3;

 public:
  C() = default;
  C(int a) : m_Option1(a) {}
  int  getOption1() const { return m_Option1; }
  void setOption1(int a) { m_Option1 = a; }
};

We can provide an implementation for Wrapper<C> that maps calls from the Interface methods to C's methods:

template<>
class Wrapper<C> : public Interface, C
{
 public:
  using C::C;
  using C::operator=;
  Wrapper(const C& t) : C(t) {}
  Wrapper(const C&& t) : C(t) {}

  virtual int  getOpt1() const final { return static_cast<const C*>(this)->getOption1(); }
  virtual void setOpt1(int a) final { static_cast<C*>(this)->setOption1(a); }
};

This is OK, and it will work, but it requires us to provide the entire implementation for Wrapper<C>. We can't just provide an implementation for getOpt1() for example.

What if we wanted to have the ability to provide an override for each method in the interface, but use a default if no override was provided? We can do this with some template magic and SFINAE. With decltype(), we can use expression SFINAE to detect if an object has a given method, or if a function can be called with a given argument. Here is the implementation:

template<size_t N>
struct priority : priority<N-1>{};
template<>
struct priority<0>{};

template<typename T>
class Wrapper : public Interface, T
{
 public:
  using T::T;
  using T::operator=;
  Wrapper(const T& t) : T(t) {}
  Wrapper(const T&& t) : T(t) {}

  virtual int  getOpt1() const final{ return _getOpt1(priority<1>{}); }
  virtual void setOpt1(int a) final { _setOpt1(a,priority<1>{}); }

 protected:
  template<typename TT = T>
  auto _getOpt1(priority<0>) const -> decltype(static_cast<const TT*>(this)->getOpt1()) { return static_cast<const TT*>(this)->getOpt1(); }
  template<typename TT = T>
  auto _getOpt1(priority<1>) const -> decltype(map_getOpt1(*static_cast<const TT*>(this))) { return map_getOpt1(*static_cast<const TT*>(this)); }

  template<typename TT = T>
  auto _setOpt1(int a,priority<0>) -> decltype(static_cast<TT*>(this)->setOpt1(a)) { static_cast<TT*>(this)->setOpt1(a); }
  template<typename TT = T>
  auto _setOpt1(int a,priority<1>) -> decltype(map_setOpt1(*static_cast<TT*>(this), a)) {map_setOpt1(*static_cast<TT*>(this), a);}
};

Using SFINAE requires a function template, but virtual functions cannot be templates. So we create a protected, non-virtual function template and forward the virtual function calls there. In our function template, we use an auto return type with -> decltype() to invoke expression SFINAE. If the expression inside decltype compiles, then the function is considered for overload resolution. Otherwise, it is not. In order for expression SFINAE to work, the expression inside decltype has to depend on a template parameter of the function template. So we have to declare a function template parameter that will be used, but we just default it to the class template parameter.

The priority<size_t> class is a nifty little utility that allows us to prioritize implementations by tagging them. The first time I saw it was in a talk by Arthur O'Dwyer. He also mentions it in this blog post. We use this tag class to provide two implementations. The priority<1> overload (the higher priority) checks whether the wrapped type can be passed to a map_* free function. The priority<0> overload just forwards the calls to member functions like before. This works because priority<1> derives from priority<0>: passing priority<1>{} matches the priority<1> overload exactly and only reaches the priority<0> overload through a derived-to-base conversion. Note that if we did not use this priority tag and both versions of the function template were valid, we would get a compiler error saying the call was ambiguous.
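The tag-dispatch idea can be tried in isolation, outside of Wrapper<T>. In the sketch below, Plain, Mapped, map_name(), describe_impl(), and describe() are hypothetical names used only for this illustration. Plain offers just a member function, while Mapped offers both a member function and a free function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template<std::size_t N>
struct priority : priority<N - 1> {};
template<>
struct priority<0> {};

// Hypothetical types: Plain has only a member; Mapped also has a free function.
struct Plain  { std::string name() const { return "member"; } };
struct Mapped { std::string name() const { return "member"; } };
inline std::string map_name(const Mapped&) { return "free function"; }

// Lower priority: viable whenever t.name() compiles.
template<typename T>
auto describe_impl(const T& t, priority<0>) -> decltype(t.name())
{
  return t.name();
}

// Higher priority: viable only when map_name(t) compiles.
template<typename T>
auto describe_impl(const T& t, priority<1>) -> decltype(map_name(t))
{
  return map_name(t);
}

// Entry point: always starts at the highest priority.
template<typename T>
std::string describe(const T& t)
{
  return describe_impl(t, priority<1>{});
}
```

For Plain, the priority<1> overload is removed by SFINAE and the call falls back to the member function. For Mapped, both overloads are viable and the exact match on priority<1> wins, so there is no ambiguity.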

Now we can provide a missing implementation, or override an existing one, by writing free functions. In the example with class C, we could use this to map C's interface to Interface, instead of specializing Wrapper<T>.

void map_setOpt1( C& c, int a) { c.setOption1(a); }
int map_getOpt1( const C& c) { return c.getOption1(); }

We can even use function templates. The code below will forward calls to getOption1() and setOption1() for any type that has them.

template<typename T>
auto map_setOpt1( T& c, int a) -> decltype(c.setOption1(a), void()) { c.setOption1(a); }
template<typename T>
auto map_getOpt1( const T& c) -> decltype(c.getOption1()) { return c.getOption1(); }
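Putting the pieces together, the sketch below wraps C using the non-template free functions and calls it through Interface. It is a trimmed arrangement of the code from this section: the explicit copy constructors and using T::operator= are omitted for brevity. One detail worth noting is that the map_* functions are found by argument-dependent lookup, so this sketch assumes they are declared in the same namespace as the class they adapt.

```cpp
#include <cassert>
#include <cstddef>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual void setOpt1(int) = 0;
  virtual ~Interface() = default;
};

template<std::size_t N>
struct priority : priority<N - 1> {};
template<>
struct priority<0> {};

template<typename T>
class Wrapper : public Interface, T
{
 public:
  using T::T;
  int  getOpt1() const final { return _getOpt1(priority<1>{}); }
  void setOpt1(int a) final { _setOpt1(a, priority<1>{}); }

 protected:
  // Fallback: forward to T's own member functions.
  template<typename TT = T>
  auto _getOpt1(priority<0>) const -> decltype(static_cast<const TT*>(this)->getOpt1())
  { return static_cast<const TT*>(this)->getOpt1(); }
  template<typename TT = T>
  auto _setOpt1(int a, priority<0>) -> decltype(static_cast<TT*>(this)->setOpt1(a))
  { static_cast<TT*>(this)->setOpt1(a); }

  // Preferred: use a map_* free function if one exists for T.
  template<typename TT = T>
  auto _getOpt1(priority<1>) const -> decltype(map_getOpt1(*static_cast<const TT*>(this)))
  { return map_getOpt1(*static_cast<const TT*>(this)); }
  template<typename TT = T>
  auto _setOpt1(int a, priority<1>) -> decltype(map_setOpt1(*static_cast<TT*>(this), a))
  { map_setOpt1(*static_cast<TT*>(this), a); }
};

// Class with a different naming convention than Interface.
class C
{
  int m_Option1 = 3;

 public:
  C() = default;
  C(int a) : m_Option1(a) {}
  int  getOption1() const { return m_Option1; }
  void setOption1(int a) { m_Option1 = a; }
};

// Free functions in C's namespace, found through argument-dependent lookup.
inline void map_setOpt1(C& c, int a) { c.setOption1(a); }
inline int  map_getOpt1(const C& c) { return c.getOption1(); }
```

A Wrapper<C> constructed with 5 reports 5 from getOpt1() when called through an Interface&, even though C itself has no getOpt1() member.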

Copying

The implementation presented above does not allow Any objects to be copied, since std::unique_ptr<T> cannot be copied. We could use a std::shared_ptr, but then we would introduce reference semantics. Copies of an Any would not be independent: modifying one would modify them all.

If we want to be able to copy Any, then we need to write a copy constructor. But the copy constructor will not know the type of the object pointed to by m_storage, it can only make calls to methods defined by Interface. So, we will have to add a virtual copy method to the Interface class.

class Interface
{
...
  virtual std::unique_ptr<Interface> copy() const = 0;
...
};

Here we return a std::unique_ptr<Interface> instead of Interface* to avoid leaking memory (using std::unique_ptr<Interface> communicates that the caller is taking ownership, and if they don't the object will be destroyed automatically). Next, we add a copy() method to Wrapper<T>

template<typename T>
class Wrapper : public Interface, T
{
...
  virtual std::unique_ptr<Interface> copy() const {return std::unique_ptr<Interface>(new Wrapper<T>(*static_cast<const T*>(this)));}
...
};

Here we allocate a new Wrapper<T> from *this and return a std::unique_ptr to it. This will require T to be copyable.

Finally, we provide a copy constructor for Any. We will probably want an assignment operator as well.

Any(const Any& a): m_storage( a.m_storage->copy() )
{ }
Any& operator=(const Any& a)
{
  m_storage = a.m_storage->copy();
  return *this;
}

Now we can copy Any objects:

Any any1 = A();
Any any2 = any1; // would not work before
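To check that copies really are independent, here is a compact sketch that combines the copy() method with a mutation test. It is a trimmed version of the classes from this post (no priority-tag dispatch and only one converting constructor), so treat it as an illustration rather than the full implementation.

```cpp
#include <cassert>
#include <memory>

class Interface
{
 public:
  virtual int getOpt1() const = 0;
  virtual void setOpt1(int) = 0;
  virtual std::unique_ptr<Interface> copy() const = 0;
  virtual ~Interface() = default;
};

template<typename T>
class Wrapper : public Interface, T
{
 public:
  Wrapper(const T& t) : T(t) {}
  int  getOpt1() const final { return static_cast<const T*>(this)->getOpt1(); }
  void setOpt1(int a) final { static_cast<T*>(this)->setOpt1(a); }
  // Deep copy: allocate a new Wrapper<T> from the T part of this object.
  std::unique_ptr<Interface> copy() const final
  {
    return std::unique_ptr<Interface>(new Wrapper<T>(*static_cast<const T*>(this)));
  }
};

class Any
{
  std::unique_ptr<Interface> m_storage;

 public:
  template<typename T>
  Any(const T& t) : m_storage(new Wrapper<T>(t)) {}

  Any(const Any& a) : m_storage(a.m_storage->copy()) {}
  Any& operator=(const Any& a)
  {
    m_storage = a.m_storage->copy();
    return *this;
  }

  int  getOpt1() const { return m_storage->getOpt1(); }
  void setOpt1(int a) { m_storage->setOpt1(a); }
};

class A
{
  int m_Opt1 = 1;

 public:
  int  getOpt1() const { return m_Opt1; }
  void setOpt1(int a) { m_Opt1 = a; }
};

// Example use: modify a copy and report what the original still holds.
inline int copy_demo()
{
  Any any1 = A();
  Any any2 = any1;   // deep copy through Interface::copy()
  any2.setOpt1(99);  // only the copy changes
  assert(any2.getOpt1() == 99);
  return any1.getOpt1();
}
```

Compare this with the pointer-based containers earlier in the post, where modifying a copied element also modified the original.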

Final Thoughts

Type erasure is a great method for getting runtime polymorphism with value-like semantics, and it allows us to separate concerns. Rather than having to remember to derive from a common base and implement all of the required methods to allow runtime polymorphism, we can just focus on what the class is supposed to do. Then, we design a type-erasing container that handles all of the runtime polymorphism requirements.

Using a wrapper class that is outside the type eraser, we can get runtime polymorphism with classes that do not inherit from a common base, which means we can write classes that only use inheritance to reuse code, and add runtime polymorphism later.

There are libraries for creating type-erasing classes, such as Boost.TypeErasure. They handle all of the details about initializing objects correctly and copying them correctly, and they reduce the boilerplate code (if we want to add a new method to our interface, we have to add it to three classes). But these libraries require you to use a DSL and can be difficult to understand for newcomers (I know I struggled with it). If you just want to provide a type-erasing container for your own library, writing an interface, wrapper, and eraser class may be simpler.
