Small-Object Optimization applied to std::variant

This article dealing with Small-Object Optimization comes from a discussion with a colleague of mine, Florian Carnac. We discussed std::variant and the fact that if you use it with types of different sizes, you can get some memory wasted.

The description of the problem

std::variant<Ts...> is a type-safe union. It means its value is one of the Ts... at a given time. Reversely, std::tuple<Ts...> contains all Ts... simultaneously. It means that if you have std::variant<std::string, int, double>, you have either one std::string, or one int or one double.

The size of a variant (e.g. sizeof(std::variant<Ts...>)), is close to std::max({sizeof(Ts)...}) + sizeof(Index). It means that if you have ten types, the variant’s size will depend on the size of the bigger type.

Let’s say you have nine types that take 8 bytes in memory and one type that takes 120 bytes in memory. The variant size will be around 128 (120 + 8). Each time you use this variant, you will use 128 bytes of memory, even if you use one of the nine tiny types.

Imagine you want to use an array of this variant. Each cell of your array will take 128 bytes of memory, which means you will waste much memory each time you don’t use the more significant type.

Small-Object Optimization to tackle the problem

From our discussion, here is what would be great: if we could construct a big object on the heap and build the tinies objects on the stack!

We introduce soo trait.

template<typename ...Ts>
class soo_variant {
    std::variant<soo<24>::apply<Ts>...> m_variant;
};

struct BigType {
    std::array<std::byte, 120> m_object;
};

int main() {
    using Variant = soo_variant<int, double, BigType>;
    static_assert(sizeof(Variant) <= 32); // size max + size of index = 24 + 8 = 32
}

Let’s implement soo. The interface is straightforward. It takes an int template parameter representing the maximum size of the object. Each type provided by the apply meta-function has a size equal to or less than that provided to soo.

static_assert(sizeof(soo<24>::apply<double>) <= 24);
static_assert(sizeof(soo<24>::apply<BigType>) <= 24);

The apply function will return two types:

  1. One type handling small objects
  2. One type handling big objects
template <int Size> struct soo {
    // if Size is great enough, we can construct directly on the stack
    template <typename T, bool OnStack = Size >= sizeof(T)> struct implementation;

    template <typename T> struct implementation<T, true> {}
    template <typename T> struct implementation<T, false> {}

    template <typename T> using apply = implementation<T>;
};

Now, let’s implement the first implementation, the one handling stack-allocated objects. It is straightforward.

template <typename T> class implementation<T, true> {
    T m_object;
public:
    implementation(T object) : m_object{std::move(object)} { }

    T &get() & { return m_object; } 
    const T &get() const & { return m_object; } 
    T get() && { return std::move(m_object); } 
};

And here is the implementation for heap-allocated objects:

    // require heap allocation
template <typename T> class implementation<T, false> {
    static_assert(sizeof(T*) <= Size);
    std::unique_ptr<T> m_object;

public:
    implementation(T object)
        : m_object{std::make_unique<T>(std::move(object))} {}

    implementation(const implementation &a) : m_object{std::make_unique<T>(*a.m_object)} {}
    implementation(implementation &&a) noexcept : m_object{std::move(a.m_object)} {}

    implementation &operator=(const implementation &a) {
        m_object = std::make_unique<T>(*a.m_object);
        return *this;
    }

    implementation &operator=(implementation &&a) {
        m_object = std::move(a.m_object);
        return *this;
    }

    T &get() & { return *m_object; } 
    const T &get() const & { return *m_object; } 
    T get() && { return std::move(*m_object); } 
};

Now, you need to write the body of soo_variant:

template<typename ...Ts>
class soo_variant{
public:
    template<typename T>
    soo_variant(T x) : m_variant{soo<24>::apply<T>{std::move(x)}}{}

    template<typename Self, typename Visitor>
    decltype(auto) visit(this Self &&self, Visitor &&visitor) {
        auto apply_visitor = [&](auto &&x) {
            return std::invoke(std::forward<Visitor>(visitor), std::forward<decltype(x)>(x).get());
        };
        return std::visit(apply_visitor, std::forward<Self>(self).m_variant);
    }

private:
    std::variant<soo<24>::apply<Ts>...> m_variant;
};

You have a basic constructor forwarding the type to the according soo::implementation. The visit function retrieves a soo::implementation and call the get function.

Conclusion

Small-Object Optimization allows us to tackle the heterogenous-sized type problem for variants. However, it has also some drawbacks:

  1. It makes it difficult to track whether a type is dynamically allocated at first glance.
  2. You cannot easily make an alias of std::variant, and makes the std::visit function ill-formed

Next time, we will discuss about “more” run-time Small-Object optimization (which is closer to what Erased uses internally).

If you want to test the code: click here.

Thanks for reading!

Comments

2 responses to “Small-Object Optimization applied to std::variant”

  1. Dirk Avatar
    Dirk

    Hello Antoine, this all looks very interesting. I started a very similar project a few months ago. Although I have reached a point where I was considering a fundamental redesign of the API, and at the moment I don’t really have the time to continue development, perhaps the project might still be useful to you? https://github.com/VolumeGraphics/traits

    1. Antoine MORRIER Avatar
      Antoine MORRIER

      Hello Dirk!
      I am going to check it more deeply with the few days.
      Thanks a lot for sharing

Leave a Reply