This article dealing with Small-Object Optimization comes from a discussion with a colleague of mine, Florian Carnac. We discussed std::variant
and the fact that if you use it with types of different sizes, you can get some memory wasted.
The description of the problem
std::variant<Ts...>
is a type-safe union. It means its value is one of the Ts...
at a given time. Reversely, std::tuple<Ts...>
contains all Ts...
simultaneously. It means that if you have std::variant<std::string, int, double>
, you have either one std::string
, or one int
or one double
.
The size of a variant (e.g. sizeof(std::variant<Ts...>)
), is close to std::max({sizeof(Ts)...}) + sizeof(Index)
. It means that if you have ten types, the variant’s size will depend on the size of the bigger type.
Let’s say you have nine types that take 8 bytes in memory and one type that takes 120 bytes in memory. The variant size will be around 128 (120 + 8). Each time you use this variant, you will use 128 bytes of memory, even if you use one of the nine tiny types.
Imagine you want to use an array of this variant. Each cell of your array will take 128 bytes of memory, which means you will waste much memory each time you don’t use the more significant type.
Small-Object Optimization to tackle the problem
From our discussion, here is what would be great: if we could construct a big object on the heap and build the tinies objects on the stack!
We introduce soo
trait.
template<typename ...Ts>
class soo_variant {
std::variant<soo<24>::apply<Ts>...> m_variant;
};
struct BigType {
std::array<std::byte, 120> m_object;
};
int main() {
using Variant = soo_variant<int, double, BigType>;
static_assert(sizeof(Variant) <= 32); // size max + size of index = 24 + 8 = 32
}
Let’s implement soo
. The interface is straightforward. It takes an int
template parameter representing the maximum size of the object. Each type provided by the apply
meta-function has a size equal to or less than that provided to soo.
static_assert(sizeof(soo<24>::apply<double>) <= 24);
static_assert(sizeof(soo<24>::apply<BigType>) <= 24);
The apply
function will return two types:
- One type handling small objects
- One type handling big objects
template <int Size> struct soo {
// if Size is great enough, we can construct directly on the stack
template <typename T, bool OnStack = Size >= sizeof(T)> struct implementation;
template <typename T> struct implementation<T, true> {}
template <typename T> struct implementation<T, false> {}
template <typename T> using apply = implementation<T>;
};
Now, let’s implement the first implementation, the one handling stack-allocated objects. It is straightforward.
template <typename T> class implementation<T, true> {
T m_object;
public:
implementation(T object) : m_object{std::move(object)} { }
T &get() & { return m_object; }
const T &get() const & { return m_object; }
T get() && { return std::move(m_object); }
};
And here is the implementation for heap-allocated objects:
// require heap allocation
template <typename T> class implementation<T, false> {
static_assert(sizeof(T*) <= Size);
std::unique_ptr<T> m_object;
public:
implementation(T object)
: m_object{std::make_unique<T>(std::move(object))} {}
implementation(const implementation &a) : m_object{std::make_unique<T>(*a.m_object)} {}
implementation(implementation &&a) noexcept : m_object{std::move(a.m_object)} {}
implementation &operator=(const implementation &a) {
m_object = std::make_unique<T>(*a.m_object);
return *this;
}
implementation &operator=(implementation &&a) {
m_object = std::move(a.m_object);
return *this;
}
T &get() & { return *m_object; }
const T &get() const & { return *m_object; }
T get() && { return std::move(*m_object); }
};
Now, you need to write the body of soo_variant
:
template<typename ...Ts>
class soo_variant{
public:
template<typename T>
soo_variant(T x) : m_variant{soo<24>::apply<T>{std::move(x)}}{}
template<typename Self, typename Visitor>
decltype(auto) visit(this Self &&self, Visitor &&visitor) {
auto apply_visitor = [&](auto &&x) {
return std::invoke(std::forward<Visitor>(visitor), std::forward<decltype(x)>(x).get());
};
return std::visit(apply_visitor, std::forward<Self>(self).m_variant);
}
private:
std::variant<soo<24>::apply<Ts>...> m_variant;
};
You have a basic constructor forwarding the type to the according soo::implementation
. The visit function retrieves a soo::implementation
and call the get
function.
Conclusion
Small-Object Optimization allows us to tackle the heterogenous-sized type problem for variants. However, it has also some drawbacks:
- It makes it difficult to track whether a type is dynamically allocated at first glance.
- You cannot easily make an alias of
std::variant
, and makes thestd::visit
function ill-formed
Next time, we will discuss about “more” run-time Small-Object optimization (which is closer to what Erased uses internally).
If you want to test the code: click here.
Thanks for reading!
Leave a Reply