C++ Memory Safety: Safe Pointer


C++ memory safety, the static way

C++ memory safety matters more and more as time goes by. My last article was about dangling references, but the check it described happens only at runtime. Today, we are going to look at some obscure metaprogramming tricks based on friend injection, which will let us build safe containers.

I want to warn you that this code is more a proof of concept than anything else because, as far as I recall, the standard committee wants to avoid making such things possible :).
I also want to say that, even though the idea has been in my mind for quite a while, doing it in an “elegant” way was inspired a lot by Mitzi.

Here is the objective we want to reach: a safe pointer.

    pointer<int> ptr{new int{5}};
    *ptr = 3;
    ptr.reset();
    //*ptr = 5; // does not compile

    pointer<int> ptr2{nullptr};
    //*ptr2 = 18; // does not compile

A reminder about Argument-Dependent Lookup

Argument-Dependent Lookup (ADL) lets you call a function declared in a namespace without qualifying it, as long as one of its arguments has a type from that namespace. An example is the Hello World program: std::cout << "Hello World". The ostream &operator<<(ostream &, const char*); function is declared within the std namespace, but since the type of cout, std::ostream, is also declared within the std namespace, the compiler finds the operator through ADL.

A second example:

namespace A {
    struct B{};
    void f(B) {}
}

f(A::B{}); // unqualified call => found through ADL
A::f(A::B{}); // qualified call

A reminder about friend injection

Friend injection is well described on the internet, but basically, friend can be used for two things:

  1. Access private or protected members of a class
  2. Inject a function into the innermost namespace surrounding the class so that this function is reachable through Argument-Dependent Lookup (ADL)

The second point is, I think, the lesser-known one.

namespace A {
    struct B {
        friend void f(B) {}
    };
}

A::f(A::B{}); // Does not compile
f(A::B{}); // Only way to get it is through ADL

Since they are not member functions but real free functions, you can also declare them in one structure and define them in another one.

struct A {
    friend void f(A);
};

struct B {
    friend void f(A) {};
};

f(A{}); // Call the definition in B

Check if a function is defined within a template context

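When a class template is instantiated, all the non-template friend functions it contains get instantiated as well.

To know whether a function is defined, you can use a requires-expression on a function that is declared with an auto return type but not defined yet: until a definition is seen, the return type cannot be deduced, so the call is not valid.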

auto f(int); // Declare your function

template<int X> // Define the function when one template is instantiated
struct DefineTheFunction{
    friend auto f(int) {}
};

template<auto X = 0, bool B = requires{f(X);}>
constexpr auto g() {
    return B; // return true if f is defined
};

// DefineTheFunction<0> _;

int main() {
    static_assert(g() == false);
}

If you uncomment the line instantiating the template, the static_assert will fire.

What is even more amazing is that if you instantiate the template between two calls to g(), you will get two different results!

int main() {
    static_assert(g() == false);
    DefineTheFunction<0> _;
    static_assert(g() == true);
}

I think it is as atrocious as it is beautiful :D.

Declaring a Context

To make our objective work as expected, the two pointers must not have exactly the same type. Likewise, the two calls to g() we made before do not refer to the same specialization.

Since C++20, we can declare such a context using auto Context = []{}: each lambda expression has its own unique type.

template<typename T, auto Context = []{}>
struct pointer {
    // black magic
};

pointer<int> p1; // pointer<int, Context1>
pointer<int> p2; // pointer<int, Context2>

The Context associated with the pointer will get modified each time you call a function of this pointer. The other one will not be impacted because they are two different types.

Producing more than two values

The idea here is simple: you create a Type<0>. If the function for this type is defined, you try the function for Type<1>, and so on. When you reach an undefined function, it means you need to instantiate it to generate a new value. You can implement a Counter with such an approach.

template<int Value>
struct InjectedValue {
    static constexpr int value = Value;
    constexpr operator int() const noexcept { return Value; }
    friend constexpr auto injected(InjectedValue<Value>);
};

template<auto evaluation, int Value = 0>
constexpr auto getNextInjectedValue() {
    constexpr auto injectedValue = InjectedValue<Value>{}; 
    constexpr bool isInjected = requires {injected(injectedValue);};
    
    if constexpr (isInjected) {
        return getNextInjectedValue<evaluation, Value + 1>();
    }
    else {
        return injectedValue;
    }
}

The idea here is simple. We create an InjectedValue<Value> that only declares the injected function, without defining it. getNextInjectedValue then iterates until it reaches a value for which injected has no definition, and returns that value. The evaluation parameter is needed here; otherwise the function would always return the same value because of memoization. Let’s implement a Counter now :).

template<int Value>
struct CounterInjector {
    friend constexpr auto injected(InjectedValue<Value>) {}
};

struct Counter {
    template<auto evaluation = []{}>
    static constexpr int next() {
        constexpr auto toInject = getNextInjectedValue<evaluation>();
        CounterInjector<toInject> _{};
        return toInject;
    }
};

We create a CounterInjector which injects the definition. The next function is easy: we get the next value to inject, which represents the current value of the counter. We then inject a definition for this value (so the next call will return value + 1), and we return the current value. The evaluation parameter is there so that each call to next() uses a different specialization. Now let’s test it!

int main() {
    static_assert(Counter::next() == 0);
    static_assert(Counter::next() == 1);
    static_assert(Counter::next() == 2);
}

Amazing !

The problem is we can have only one counter. Let’s add a Context template parameter for each object!

template<int Value, auto Context>
struct InjectedValue {
    static constexpr int value = Value;
    constexpr operator int() const noexcept { return Value; }
    friend constexpr auto injected(InjectedValue<Value, Context>);
};

template<auto evaluation, auto Context, int Value = 0>
constexpr auto getNextInjectedValue() {
    constexpr auto injectedValue = InjectedValue<Value, Context>{}; 
    constexpr bool isInjected = requires {injected(injectedValue);};
    
    if constexpr (isInjected) {
        return getNextInjectedValue<evaluation, Context, Value + 1>();
    }
    else {
        return injectedValue;
    }
}

template<int Value, auto Context>
struct CounterInjector {
    friend constexpr auto injected(InjectedValue<Value, Context>) {}
};

template<auto Context = []{}>
struct Counter {
    template<auto evaluation = []{}>
    static constexpr int next() {
        constexpr auto toInject = getNextInjectedValue<evaluation, Context>();
        CounterInjector<toInject, Context> _{};
        return toInject;
    }
};

And the testing:

int main() {
    using C1 = Counter<>;
    using C2 = Counter<>;
    static_assert(C1::next() == 0);
    static_assert(C1::next() == 1);
    static_assert(C1::next() == 2);

    static_assert(C2::next() == 0);
    static_assert(C2::next() == 1);
    static_assert(C2::next() == 2);
}

Perfect, we have reached our goal.

Let’s write the safe_pointer<T>

The first step for C++ memory safety is to avoid dangling references. Pointers are one of the biggest sources of bugs in C++. For this pointer, I propose 3 states.

  1. Initialized: we know at compile time that the pointer is initialized
  2. Destroyed: we know at compile time that the pointer is not initialized (the NullPointer tag below)
  3. Unknown: we don’t know at compile time whether the pointer is initialized or not

For the sake of simplicity, I propose to ignore the third case for now.

struct InitializedPointer {};
struct NullPointer {};

template<typename T, auto Context = []{}>
struct safe_pointer {
public:
    safe_pointer() : m_ptr{nullptr} {
        // set state to NullPointer
    }
    safe_pointer(decltype(nullptr)) : m_ptr{nullptr}{
        // set state to NullPointer
    }
    safe_pointer(T *ptr) : m_ptr{ptr} {
        // set state to InitializedPointer
    }
    ~safe_pointer() { delete m_ptr; }

    template<auto evaluation = []{}>
    T &operator*() {
        // static_assert(state is InitializedPointer)
        return *m_ptr;
    }

    template<auto evaluation = []{}>
    void reset() {
        // set state to NullPointer
        delete m_ptr;
        m_ptr = nullptr;
    }
    
private:
    T *m_ptr;
};

At this point, we see that we need a State object that can be modified at compile time.

Let’s design it!

template<typename T>
struct State {
    using type = T;
};

template<auto evaluation, auto Context>
constexpr auto getLastInjectedValue() {
    constexpr auto nextValueToInject = getNextInjectedValue<evaluation, Context>();
    return InjectedValue<nextValueToInject - 1, Context>{};
}

template<typename T, int Value, auto Context>
struct StateInjector {
    friend constexpr auto injected(InjectedValue<Value, Context>) {
        return State<T>{};
    }
};

template<typename First, auto Context = []{}>
struct MetaState {
    static constexpr auto context = Context;
    static constexpr auto first = StateInjector<First, 0, context>{};

    template<auto evaluation = []{}>
    using get = typename decltype(injected(getLastInjectedValue<evaluation, context>()))::type;

    template<typename T, auto evaluation = []{}>
    static constexpr auto set() {
        constexpr auto toInject = getNextInjectedValue<evaluation, context>();
        return StateInjector<T, toInject, context> {};
    }
};

We begin by creating a State object that just carries a type. Then we create a function returning the last injected value. We also introduce a StateInjector. Its injected function has a definition as before, but what is new is that it returns T wrapped in a State<T>. This allows client code to retrieve the type through ADL! It’s a little black magic!

The MetaState<First> provides 2 members: get, which yields the latest set type, and set, which injects another type into the Context.

With a little test:

    using MS = MetaState<int>;

    static_assert(std::is_same_v<MS::get<>, int>);
    MS::set<double>();    
    static_assert(std::is_same_v<MS::get<>, double>);
    MS::set<char>();
    static_assert(std::is_same_v<MS::get<>, char>);

Let’s complete the safe_pointer now !

template<typename T, auto Context = []{}>
struct safe_pointer {
    using state = MetaState<State<void>, Context>;
public:
    safe_pointer() : m_ptr{nullptr} {
        state::template set<NullPointer>();
    }
    safe_pointer(decltype(nullptr)) : m_ptr{nullptr}{
        state::template set<NullPointer>();
    }
    safe_pointer(T *ptr) : m_ptr{ptr} {
        state::template set<InitializedPointer>();
    }
    ~safe_pointer() { delete m_ptr; }

    template<auto evaluation = []{}>
    T &operator*() {
        using current = typename state::template get<>;
        static_assert(std::is_same_v<current, InitializedPointer>);
        return *m_ptr;
    }

    template<auto evaluation = []{}>
    void reset() {
        state::template set<NullPointer>();
        delete m_ptr;
        m_ptr = nullptr;
    }
    
private:
    T *m_ptr;
};

And the little test as usual !

int main() {    
    safe_pointer<int> p1{new int};

    *p1 = 41;
    p1.reset();
    *p1 = 53; // does not compile
    
    safe_pointer<int> p2{nullptr};
    safe_pointer<int> p3;

    *p2 = 20; // does not compile
    *p3 = 43; // does not compile
}

Et voilà!

Here is the full implementation: Full Implementation on Wandbox

Conclusion

I hope you enjoyed this article. If you are interested, we can look in more detail at other things like safe_optional, safe_vector, and safe_ref.
We can also see whether we are able to manage reference counting with exclusive access.

But be aware that these techniques may not be suitable for production code ;).

See you!
Thanks to Patrice Espie for the review :).

Comments

4 responses to “C++ Memory Safety: Safe Pointer”

  1. eao197

    Hi!

    What about this case?

    safe_pointer<int> p1{new int};
    *p1 = 41;
    if(argc > 2) {
        p1.reset();
    }
    else {
        *p1 = 53; // does not compile
    }

    1. Antoine MORRIER

      Hello!
      Very nice catch, and an interesting case.
      Here, we can propose two things:

      1. In this case, we can’t know the state at compile time, so it could be interesting to use the unknown tag
      2. But in such a case, maybe `std::unique_ptr` is the way to go
      1. eao197

        Another use case that has to be discussed: passing instances of safe_pointer or storing them as members of data structures. As I understand it, it’s not easy because every safe_pointer is a unique type. So you can’t just write a function like:

        void consume_pointer(safe_pointer<int> ptr) {…}

        It just won’t compile, and you have to write:

        template<auto Ctx>
        void consume_pointer(safe_pointer<int, Ctx> ptr) {…}

        1. Antoine MORRIER

          Yes, sure :).
          As you can see, there is no miracle here :).
          However, it seems legitimate: you can’t have something that checks everything across functions or even translation units, unfortunately.
          But I am sure we can explore this world to get some safety that we don’t have in “normal” C++ :). That is the purpose of this series of articles :).
