An attempt to remove dangling pointers in C++

Have you ever had dangling pointers or references in your application? If so, this article opens a discussion about one way to try to get rid of them.

A bit of Context

As many of you may have heard, memory-safe languages have been discussed a lot over the last few months, notably by governmental organizations such as the NSA and the White House.
I can understand why such organizations appreciate memory-safe languages. I have used Rust a bit, and yes, it is pleasant from a developer’s point of view because many memory flaws are caught directly by the compiler. However, I can also understand that companies don’t want to rewrite all their software from C++ in another language, whether Rust, Java, or C#, or force their employees to learn a new language. Either option costs productivity and can cost market share.

Dangling pointers

First, what is a dangling pointer? It is a pointer to a memory location that is no longer valid, and using it can result in, for example, a use-after-free. That is the specific problem I want to discuss in this article.

int *p_a = nullptr;
{
    int a = 10;
    p_a = &a;
}
*p_a = 12; // undefined behavior: a is already destroyed

There is already one well-known solution to this problem: the weak reference. In C++, it is expressed as the pair shared_ptr and weak_ptr.
The idea is simple: once all the owners (shared_ptr) are destroyed, the weak references (weak_ptr) expire, and locking one yields a null pointer instead of a pointer to the destroyed object.
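To make the expiry concrete, here is a minimal sketch using only the standard library. The function name weak_ptr_expires_with_last_owner is made up for this illustration.

```cpp
#include <cassert>
#include <memory>

// Illustrative function (name is hypothetical): checks that a weak_ptr sees
// the object while an owner exists and reports expiry once the owner is gone.
inline bool weak_ptr_expires_with_last_owner() {
    auto owner = std::make_shared<int>(10);
    std::weak_ptr<int> observer = owner;
    assert(owner.use_count() == 1);   // the weak_ptr does not count as an owner

    const bool alive_before = !observer.expired() && observer.lock() != nullptr;

    owner.reset();                    // last shared_ptr gone: the int is destroyed

    const bool empty_after = observer.expired() && observer.lock() == nullptr;
    return alive_before && empty_after;
}
```

The weak_ptr never keeps the object alive by itself; it only lets you ask whether the object is still there.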

It is a very nice and useful pattern. However, sometimes we assume that the weak_ptr is always valid when we reach a given piece of code, and we don’t check whether the object is still alive.

std::shared_ptr<int> sp = ...;
std::weak_ptr<int> wp = sp;

...

if(auto sp2 = wp.lock()) {
  use(*sp2); // safe because of the test
}

std::shared_ptr<int> sp = ...;
std::weak_ptr<int> wp = sp;

...

use(*wp.lock()); // undefined behavior if the object is gone: lock() returns an empty shared_ptr

Since we are all human, it can happen that, even when we are sure there is no problem, one arises anyway and boom, there is a vulnerability waiting to be exploited.
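One way to make the check harder to forget is to hide lock() behind a small helper that only runs a callback when the object is still alive. This is just a sketch: with_locked and count_successful_calls are hypothetical names, not something from the standard library.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper (not part of the standard library): runs fn on the
// object only if it is still alive and reports whether it did.
template <typename T, typename Fn>
bool with_locked(const std::weak_ptr<T>& wp, Fn&& fn) {
    if (auto sp = wp.lock()) {      // sp keeps the object alive during the call
        std::forward<Fn>(fn)(*sp);
        return true;
    }
    return false;                   // expired: fn is never called
}

// Usage sketch: counts how many times the callback actually ran.
inline int count_successful_calls() {
    auto owner = std::make_shared<int>(7);
    std::weak_ptr<int> observer = owner;
    int calls = 0;

    const bool ran_alive = with_locked(observer, [&](int& value) { calls += (value == 7); });
    owner.reset();                  // destroy the object
    const bool ran_expired = with_locked(observer, [&](int&) { ++calls; });

    assert(ran_alive && !ran_expired);
    return calls;
}
```

This does not remove the problem, since nothing forces callers to use the helper, but it turns the silent failure into a return value that can be tested.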

Introducing not_dangling and ref objects

The idea I want to share is to make it impossible to destroy a resource while weak pointers still refer to it. Put simply, when the object is destroyed while one or more references still point to it, we call std::terminate. Unfortunately, unlike in Rust, we cannot catch such errors at compile time without resorting to ugly stateful metaprogramming :/.

Here is what I propose to avoid this kind of dangling reference (obviously, the code is as simple as possible and not intended for production use).

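#include <atomic>
#include <exception>
#include <memory>
#include <optional>

// Base class: counts the live ref objects that point to this object.
class not_dangling {
    template<typename>
    friend class ref;
public:
    // Destroying the object while a ref still points to it ends the program.
    ~not_dangling() {
        if(m_reference_count.load(std::memory_order_relaxed) != 0)
            std::terminate();
    }

private:
    // mutable so that a ref<const T> can update the counter too
    mutable std::atomic_int m_reference_count{0};
};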

// Counted reference to a T that derives from not_dangling.
template<typename T>
class ref {
    template<typename>
    friend class ref;
public:
    ref(T &object) noexcept : m_object(object) {
        m_object.m_reference_count.fetch_add(1, std::memory_order_relaxed);
    }

    // Converting copy (e.g. ref<Derived> to ref<Base>, or T to const T).
    // Only implicit conversions are accepted, so an unchecked downcast does not compile.
    template<typename U>
    ref(const ref<U> &other) noexcept : m_object(other.m_object) {
        m_object.m_reference_count.fetch_add(1, std::memory_order_relaxed);
    }

    ref(const ref &other) noexcept : m_object(other.m_object) {
        m_object.m_reference_count.fetch_add(1, std::memory_order_relaxed);
    }

    T &operator*() const noexcept { return m_object; }
    T *operator->() const noexcept { return std::addressof(m_object); }

    ~ref() {
        m_object.m_reference_count.fetch_sub(1, std::memory_order_relaxed);
    }
private:
    T &m_object;
};

struct Object : public not_dangling {

};

struct Derived : public Object {

};

int main()
{
    std::optional<Derived> a;
    a.emplace();

    ref<Object> ref_base(*a);
    ref<const Object> ref_base_2(ref_base);
    ref<const Derived> ref_derived(*a);
    ref<const Derived> ref_derived_2(ref_derived);

    return 0;
}
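To see how the counter evolves, here is a reduced, self-contained variant of the two classes. It is only a sketch: the live_refs() accessor, the Trace struct, the trace_counts() function and the sketch namespace are additions made for this illustration so that the counter can be observed. They are not part of the proposal above.

```cpp
#include <atomic>
#include <cassert>
#include <exception>

namespace sketch {

class not_dangling {
    template <typename> friend class ref;
public:
    ~not_dangling() {
        if (m_reference_count.load(std::memory_order_relaxed) != 0)
            std::terminate();
    }
    // Hypothetical accessor, added only to make the counter observable.
    int live_refs() const noexcept {
        return m_reference_count.load(std::memory_order_relaxed);
    }
private:
    mutable std::atomic_int m_reference_count{0};
};

template <typename T>
class ref {
public:
    explicit ref(T& object) noexcept : m_object(object) {
        m_object.m_reference_count.fetch_add(1, std::memory_order_relaxed);
    }
    ref(const ref& other) noexcept : m_object(other.m_object) {
        m_object.m_reference_count.fetch_add(1, std::memory_order_relaxed);
    }
    ref& operator=(const ref&) = delete;
    ~ref() { m_object.m_reference_count.fetch_sub(1, std::memory_order_relaxed); }
    T& operator*() const noexcept { return m_object; }
private:
    T& m_object;
};

struct Object : not_dangling { int value = 0; };

// Counter values observed at three points in time.
struct Trace { int with_two_refs; int with_one_ref; int with_no_ref; };

inline Trace trace_counts() {
    Object obj;
    Trace t{};
    {
        ref<Object> r1(obj);
        {
            ref<Object> r2(r1);               // copying a ref also counts
            t.with_two_refs = obj.live_refs();
        }                                     // r2 destroyed
        t.with_one_ref = obj.live_refs();
    }                                         // r1 destroyed
    t.with_no_ref = obj.live_refs();
    assert(t.with_no_ref == 0);               // safe to destroy obj now
    return t;                                 // obj dies with zero refs: no terminate
}

} // namespace sketch
```

If obj were destroyed while r1 was still alive, for example by holding it in a std::optional and calling reset() inside the inner scope, the destructor would find a non-zero counter and call std::terminate.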

Unfortunately, we must rely on a wrapper to do that, so it is not a “plug-and-play” solution that you can drop into your code directly. The cons are:

  • It disables the triviality of otherwise trivial types because of the user-defined destructor
  • It may have a small performance overhead
  • It is intrusive: you must derive your object from it…

The first point is, in my experience, not a real problem. I have never needed a reference to a type that also had to be trivial. If the type happened to be trivial, good; if it was not, I guess I would never have noticed :).

The second point is fair: even if the performance impact is small, sometimes it is still too much. However, we can easily hide the two types behind aliases. While your QA team and other developers are working, you use the objects we just discussed, and when you build the commercial version, you switch to passthrough aliases.

The last point is due to inheritance. If you have a better idea, please tell me :).

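The aliases mentioned for the second point could look like this:

// EnableNotDangling is a compile-time bool; not_dangling_base and ref_wrapper are
// the two classes above under new names, and empty_class is an empty struct.
template<typename T>
using not_dangling = std::conditional_t<EnableNotDangling, not_dangling_base, empty_class>;

// Note: T* is built from a pointer and ref_wrapper<T> from a reference.
template<typename T>
using ref = std::conditional_t<EnableNotDangling, ref_wrapper<T>, T*>;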
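Here is a slightly more complete sketch of that switch, written so that calling code compiles unchanged in both modes. Several things are assumptions of mine: the ENABLE_NOT_DANGLING macro, the passthrough_ref class (used instead of a raw T* so that construction from a reference works in both modes), the non-template form of the not_dangling alias, and the read_through_ref function.

```cpp
#include <atomic>
#include <cassert>
#include <exception>
#include <type_traits>

// Assumption: a project-wide switch, typically set by the build system.
#ifndef ENABLE_NOT_DANGLING
#define ENABLE_NOT_DANGLING 0
#endif
inline constexpr bool EnableNotDangling = ENABLE_NOT_DANGLING != 0;

struct empty_class {};

class not_dangling_base {
    template <typename> friend class ref_wrapper;
public:
    ~not_dangling_base() {
        if (m_count.load(std::memory_order_relaxed) != 0) std::terminate();
    }
private:
    mutable std::atomic_int m_count{0};
};

template <typename T>
class ref_wrapper {
public:
    explicit ref_wrapper(T& object) noexcept : m_object(object) {
        m_object.m_count.fetch_add(1, std::memory_order_relaxed);
    }
    ref_wrapper(const ref_wrapper& other) noexcept : m_object(other.m_object) {
        m_object.m_count.fetch_add(1, std::memory_order_relaxed);
    }
    ref_wrapper& operator=(const ref_wrapper&) = delete;
    ~ref_wrapper() { m_object.m_count.fetch_sub(1, std::memory_order_relaxed); }
    T& operator*() const noexcept { return m_object; }
    T* operator->() const noexcept { return &m_object; }
private:
    T& m_object;
};

// Hypothetical passthrough: same interface as ref_wrapper, no counting.
template <typename T>
class passthrough_ref {
public:
    explicit passthrough_ref(T& object) noexcept : m_object(&object) {}
    T& operator*() const noexcept { return *m_object; }
    T* operator->() const noexcept { return m_object; }
private:
    T* m_object;
};

using not_dangling =
    std::conditional_t<EnableNotDangling, not_dangling_base, empty_class>;

template <typename T>
using ref =
    std::conditional_t<EnableNotDangling, ref_wrapper<T>, passthrough_ref<T>>;

struct Object : not_dangling { int value = 42; };

// The same calling code works whichever mode is selected.
inline int read_through_ref() {
    Object obj;
    ref<Object> r(obj);
    assert(&*r == &obj);
    return r->value;
}
```

With the switch off, Object derives from an empty class and keeps its trivial destructor, which also takes care of the first drawback in release builds.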

Conclusion

Have you already faced problems with dangling pointers? If so, what do you think about such an approach? Would you consider using such objects in your code base, or at least for debugging?

Thanks for reading,
