C++ memory safety, the static way
C++ memory safety keeps growing in importance. My last article was about dangling references, but there the check was done only at runtime. Today, we are going to look at some obscure metaprogramming tricks related to friend injection. They will allow us to make safe containers whose checks happen at compile time.
I want to warn readers that this code is more a proof of concept than anything else because, as far as I recall, the standard committee wants to avoid making such things possible :).
I also want to say that, even though the idea has been in my mind for a long time, doing it in an “elegant” way is largely inspired by Mitzi.
Here is the objective we want to reach: a safe pointer.
pointer<int> ptr{new int{5}};
*ptr = 3;
ptr.reset();
// *ptr = 5; // does not compile
pointer<int> ptr2{nullptr};
// *ptr2 = 18; // does not compile
Recall of Argument-Dependent Lookup
Argument-Dependent Lookup (ADL) lets you call functions declared in a namespace without qualifying the call with that namespace. The classic example is Hello World: std::cout << "Hello World". The ostream &operator<<(ostream &, const char*); function is defined within the std namespace, but since cout is an object whose type, std::ostream, is declared within the std namespace, the compiler can find the operator easily through ADL.
A second example could be:
namespace A {
    struct B {};
    void f(B) {}
}
f(A::B{});    // unqualified call => found through ADL
A::f(A::B{}); // qualified call
Recall of friend injection
Friend injection is well described on the internet, but basically, friend can be used for two things:
- Access private or protected members of a class
- Inject a function into the innermost namespace surrounding the class so that this function is reachable through Argument-Dependent Lookup (ADL)
The second point is, I think, the lesser-known one.
namespace A {
    struct B {
        friend void f(B) {}
    };
}
A::f(A::B{}); // Does not compile
f(A::B{});    // Only way to get it is through ADL
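As a slightly more concrete sketch of this second use, here is a hidden friend that does real work. The geometry namespace, the Point type and the same_point function are names invented for this illustration; the only point is that the unqualified call compiles although same_point is never declared at namespace scope.

```cpp
namespace geometry {
    struct Point {
        int x;
        int y;
        // Hidden friend: a real namespace-scope function, but it is never
        // declared outside the class, so only ADL can find it.
        friend constexpr bool same_point(Point a, Point b) {
            return a.x == b.x && a.y == b.y;
        }
    };
}

// Unqualified call: the arguments are of type geometry::Point, so ADL
// also looks at the friends declared inside Point.
static_assert(same_point(geometry::Point{1, 2}, geometry::Point{1, 2}),
              "found through ADL");
// geometry::same_point(...) would not compile: qualified lookup cannot see it.
```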
Since friend functions are not member functions but real namespace-scope functions, you can also declare them in one structure and define them in another one.
struct A {
    friend void f(A);
};
struct B {
    friend void f(A) {}
};
f(A{}); // Calls the definition in B
Check if a function is defined within a template context
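When a class template is instantiated, the non-template friend functions defined in its body become defined as well.
To know whether a function is defined, you can use a requires expression on a function that is declared to return auto but is not defined yet: until the definition is seen, its return type cannot be deduced, so a call to it is not a valid expression.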
auto f(int); // Declare your function, without defining it
template<int X> // Define the function when one template is instantiated
struct DefineTheFunction {
    friend auto f(int) {}
};
template<auto X = 0, bool B = requires { f(X); }>
constexpr auto g() {
    return B; // return true if f is defined
}
// DefineTheFunction<0> _;
int main() {
    static_assert(g() == false);
}
If you uncomment the line instantiating the template, the static_assert will fire.
What is even more amazing is that if you instantiate the template between two calls to g(), you will get two different results!
int main() {
    static_assert(g() == false);
    DefineTheFunction<0> _;
    static_assert(g() == true);
}
I think it is as atrocious as it is beautiful :D.
Declaring a Context
To reach our objective, the two pointers must not have exactly the same type. Likewise, the two calls to the g() function we had before were not calls to the same specialization.
Since C++20, we can declare such a context using auto Context = []{}: each lambda expression has its own unique type, so each use of the default argument produces a different Context.
template<typename T, auto Context = []{}>
struct pointer {
    // black magic
};
pointer<int> p1; // pointer<int, Context1>
pointer<int> p2; // pointer<int, Context2>
The Context associated with a pointer gets modified each time you call a function of this pointer. The other pointer is not impacted because the two have different types.
Be able to produce more than two values
The idea here is simple: you create a Type<0>. If the function for this type is defined, you try the function for Type<1>, and so on. When you get an undefined function, it means you need to instantiate it to generate a new value. You can implement a Counter with such an approach.
template<int Value>
struct InjectedValue {
    static constexpr int value = Value;
    constexpr operator int() const noexcept { return Value; }
    friend constexpr auto injected(InjectedValue<Value>);
};
template<auto evaluation, int Value = 0>
constexpr auto getNextInjectedValue() {
    constexpr auto injectedValue = InjectedValue<Value>{};
    constexpr bool isInjected = requires { injected(injectedValue); };
    if constexpr (isInjected) {
        return getNextInjectedValue<evaluation, Value + 1>();
    }
    else {
        return injectedValue;
    }
}
Here the idea is simple: we create an InjectedValue<Value>. It declares the injected function but does not define it. We then define getNextInjectedValue, which iterates until it does not find a definition for the given function. It returns the value for which there is no injected definition available. The evaluation parameter is needed here, or else the function would always return the same value because of memoization: a given specialization is instantiated only once. Let’s implement a Counter now :).
template<int Value>
struct CounterInjector {
    friend constexpr auto injected(InjectedValue<Value>) {}
};
struct Counter {
    template<auto evaluation = []{}>
    static constexpr int next() {
        constexpr auto toInject = getNextInjectedValue<evaluation>();
        CounterInjector<toInject> _{};
        return toInject;
    }
};
We create a CounterInjector which injects the definition. The next function is easy: we get the next value to inject, which represents the current value of the counter. We then inject a definition for this value (so the next call will return value + 1), and we return the current value. The evaluation parameter here gives each call to next() a different specialization. Now let’s test it!
int main() {
    static_assert(Counter::next() == 0);
    static_assert(Counter::next() == 1);
    static_assert(Counter::next() == 2);
}
Amazing!
The problem is that we can have only one counter. Let’s add a Context template parameter for each object!
template<int Value, auto Context>
struct InjectedValue {
    static constexpr int value = Value;
    constexpr operator int() const noexcept { return Value; }
    friend constexpr auto injected(InjectedValue<Value, Context>);
};
template<auto evaluation, auto Context, int Value = 0>
constexpr auto getNextInjectedValue() {
    constexpr auto injectedValue = InjectedValue<Value, Context>{};
    constexpr bool isInjected = requires { injected(injectedValue); };
    if constexpr (isInjected) {
        return getNextInjectedValue<evaluation, Context, Value + 1>();
    }
    else {
        return injectedValue;
    }
}
template<int Value, auto Context>
struct CounterInjector {
    friend constexpr auto injected(InjectedValue<Value, Context>) {}
};
template<auto Context = []{}>
struct Counter {
    template<auto evaluation = []{}>
    static constexpr int next() {
        constexpr auto toInject = getNextInjectedValue<evaluation, Context>();
        CounterInjector<toInject, Context> _{};
        return toInject;
    }
};
And the testing:
int main() {
    using C1 = Counter<>;
    using C2 = Counter<>;
    static_assert(C1::next() == 0);
    static_assert(C1::next() == 1);
    static_assert(C1::next() == 2);
    static_assert(C2::next() == 0);
    static_assert(C2::next() == 1);
    static_assert(C2::next() == 2);
}
Perfect, we reached our goal.
Let’s write the safe_pointer<T>
The first step for C++ memory safety is to avoid dangling references. Pointers are one of the biggest sources of bugs in C++. For this pointer, I propose 3 states:
- Initialized: we know at compile time that the pointer is initialized
- Destroyed: we know at compile time that the pointer is not initialized
- Unknown: we don’t know, at compile time, whether the pointer is initialized or not
For the sake of simplicity, I propose not to handle the third case for now.
struct InitializedPointer {};
struct NullPointer {};
template<typename T, auto Context = []{}>
struct safe_pointer {
public:
    safe_pointer() : m_ptr{nullptr} {
        // set state to NullPointer
    }
    safe_pointer(decltype(nullptr)) : m_ptr{nullptr} {
        // set state to NullPointer
    }
    safe_pointer(T *ptr) : m_ptr{ptr} {
        // set state to InitializedPointer
    }
    ~safe_pointer() { delete m_ptr; }
    template<auto evaluation = []{}>
    T &operator*() {
        // static_assert(state is InitializedPointer)
        return *m_ptr;
    }
    template<auto evaluation = []{}>
    void reset() {
        // set state to NullPointer
        delete m_ptr;
        m_ptr = nullptr;
    }
private:
    T *m_ptr;
};
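The static_assert placeholder in operator* comes down to a plain type comparison once the current state is known as a type. A minimal sketch, where can_dereference is a hypothetical helper that is not part of the final implementation:

```cpp
#include <type_traits>

struct InitializedPointer {};
struct NullPointer {};

// Dereferencing is allowed only when the compile-time state says that the
// pointer is initialized.
template<typename CurrentState>
constexpr bool can_dereference =
    std::is_same_v<CurrentState, InitializedPointer>;

static_assert(can_dereference<InitializedPointer>);
static_assert(!can_dereference<NullPointer>);
```

The hard part is obtaining that current state as a type that changes from one statement to the next.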
Once we get here, we understand that we need a State object that is modifiable at compile time.
Let’s design it!
template<typename T>
struct State {
    using type = T;
};
template<auto evaluation, auto Context>
constexpr auto getLastInjectedValue() {
    constexpr auto nextValueToInject = getNextInjectedValue<evaluation, Context>();
    return InjectedValue<nextValueToInject - 1, Context>{};
}
template<typename T, int Value, auto Context>
struct StateInjector {
    friend constexpr auto injected(InjectedValue<Value, Context>) {
        return State<T>{};
    }
};
template<typename First, auto Context = []{}>
struct MetaState {
    static constexpr auto context = Context;
    static constexpr auto first = StateInjector<First, 0, context>{};
    template<auto evaluation = []{}>
    using get = typename decltype(injected(getLastInjectedValue<evaluation, context>()))::type;
    template<typename T, auto evaluation = []{}>
    static constexpr auto set() {
        constexpr auto toInject = getNextInjectedValue<evaluation, context>();
        return StateInjector<T, toInject, context> {};
    }
};
We begin by creating a State object that just holds a type. Then we create a function returning the last injected value. We introduce a StateInjector: as before, its injected function has a definition, but what is new is that it returns the T wrapped in a State<T>. This allows client code to get the type back through ADL! It’s a little black magic!
MetaState<First> offers two operations: get, which returns the latest set type, and set, which injects another type into the Context.
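The way get extracts a type from a function can be shown without any injection. In this sketch, Key and lookup are hypothetical stand-ins for InjectedValue and injected, and found is a name made up for the example; State is the same wrapper as above.

```cpp
#include <type_traits>

template<typename T>
struct State {
    using type = T;
};

struct Key {};

// The function is never called at runtime: only its return type matters.
constexpr auto lookup(Key) { return State<double>{}; }

// decltype asks the compiler which function the call would select and what
// it returns; ::type then unwraps the type stored in State.
using found = decltype(lookup(Key{}))::type;

static_assert(std::is_same_v<found, double>);
```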
With a little test:
using MS = MetaState<int>;
static_assert(std::is_same_v<MS::get<>, int>);
MS::set<double>();
static_assert(std::is_same_v<MS::get<>, double>);
MS::set<char>();
static_assert(std::is_same_v<MS::get<>, char>);
Let’s complete the safe_pointer now!
template<typename T, auto Context = []{}>
struct safe_pointer {
    using state = MetaState<State<void>, Context>;
public:
    safe_pointer() : m_ptr{nullptr} {
        state::template set<NullPointer>();
    }
    safe_pointer(decltype(nullptr)) : m_ptr{nullptr} {
        state::template set<NullPointer>();
    }
    safe_pointer(T *ptr) : m_ptr{ptr} {
        state::template set<InitializedPointer>();
    }
    ~safe_pointer() { delete m_ptr; }
    template<auto evaluation = []{}>
    T &operator*() {
        using current = state::template get<>;
        static_assert(std::is_same_v<current, InitializedPointer>);
        return *m_ptr;
    }
    template<auto evaluation = []{}>
    void reset() {
        state::template set<NullPointer>();
        delete m_ptr;
        m_ptr = nullptr;
    }
private:
    T *m_ptr;
};
And the little test as usual!
int main() {
    safe_pointer<int> p1{new int};
    *p1 = 41;
    p1.reset();
    *p1 = 53; // does not compile
    safe_pointer<int> p2{nullptr};
    safe_pointer<int> p3;
    *p2 = 20; // does not compile
    *p3 = 43; // does not compile
}
Et voilà!
Here is the link to the full implementation: Full Implementation on wandbox
Conclusion
I hope you enjoyed this article. If you are interested, we can look in more detail at some other things like safe_optional, safe_vector, and safe_ref.
We can also see whether we are able to manage ref counting with exclusive access.
But be aware that these techniques may not be suitable for production code ;).
See you!
Thanks to Patrice Espie for the review :).