This article will focus mostly on how to simplify hashing in C++. Hashing is a major feature that allows us to use unordered_map, which is the C++ implementation of a HashMap. If you want to hash a value in C++, you can use the std::hash specialization for the type you want to hash.
int main() {
std::cout << std::hash<std::string>{}("My name is Antoine") << std::endl;
}
If you want to make a user-defined type to be hashable, it’s also possible through template specialization of std::hash:
struct Person {
std::string name;
};
template<>
struct std::hash<Person> {
std::size_t operator()(const Person& person) const noexcept {
return std::hash<std::string>{}(person.name);
}
};
That’s nice and all, however, we can easily detect three caveats here:
- The syntax with
std::hashis painful. Something close tohash(a)should be a better approach - The syntax to
std::hasha user-defined type is even more painful. Adding a simple memberhashfunction (or a free function) should be the way to go. - It’s not possible to
std::hasheasily multiple values, either directly(e.ghash(a, b), or indirectly (e.ghash(vector),hash(tuple)).
We are going to see how to simplify this step-by-step. The final objective would be to be able to write something close to:
class Person {
public:
std::size_t hash() const noexcept { return ::hash(m_age, m_name); }
private:
int m_age;
std::string m_name;
};
Here, Person becomes magically hashable and can be used directly in an unordered_map or an unordered_set
Simplify hashing in C++: syntax
We want to be able to write something close to: hash(value). To do that, we need to do different things:
- We need to know if a type is hashable through
std::hash - We need to create a function forwarding the value to the good specialization
// library side
template<typename T>
concept is_std_hashable = requires(const T& object) {
{std::hash<T>{}(object)} -> std::convertible_to<std::size_t>;
};
template<typename T>
std::size_t hash(const T& object) {
static_assert(is_std_hashable<T>);
return std::hash<T>{}(object);
}
// User side
int main() {
std::cout << hash("My name is Antoine"sv) << std::endl;
}
The first step has been resolved successfully.
Simplify the hashing of a user-defined type
What we want to do here is to make a type hashable only by adding a member function hash returning a std::size_t
Again, to be able to do that, we need to follow several steps:
- Adding a concept
has_member_hash_function - Adding a concept
hashable - Rewrite the
hashfunction
// library side
template<typename T>
concept has_hash_member_function = requires(const T& object) {
{object.hash()} -> std::convertible_to<std::size_t>;
};
template<typename T>
concept hashable = is_std_hashable<T> || has_hash_member_function<T>;
template<hashable T>
std::size_t hash(const T& object) {
if constexpr(is_std_hashable<T>)
return std::hash<T>{}(object);
else
return object.hash();
}
// user side
struct Person {
bool operator==(const Person&) const = default;
std::size_t hash() const noexcept { return ::hash(name); }
std::string name;
int age;
};
int main() {
Person person{"Antoine", 33};
std::cout << hash(person) << std::endl;
}
Adding specialization automatically for std::hash.
Everything is fine; however, even if it is straightforward, we cannot yet have std::unordered_set<Person> it because there is no template specialization. Let’s really simplify hashing in C++!
- We need to add a generic
hashspecialization handling thehashmember function. We can also restrict it to the types which are notis_std_hashable, but usually, the specialization of a concrete type being a better match than the one with concept, it should not be necessary.
template<has_hash_member_function T>
struct std::hash<T>{
std::size_t operator()(const T& object) const noexcept {
return object.hash();
}
};
// user side
struct Person {
bool operator==(const Person&) const = default;
std::size_t hash() const noexcept { return ::hash(name); }
std::string name;
int age;
};
int main() {
std::unordered_set<Person> persons;
persons.emplace("Antoine", 33);
persons.emplace("Cedric", 30);
persons.emplace("Antoine", 30);
persons.emplace("Vasyl", 29);
for(const auto &person: persons)
{
std::println("{}: {}", person.name, person.age);
}
}
Simplify hashing in C++: Adding combination
A simple way to combine hashes is to use the same function as the one in boost. It’s not optimal, but you can easily find a better implementation with better distribution. See the reference part for that
template <class T>
inline void hash_combine(std::size_t& seed, const T& v)
{
std::hash<T> hasher;
seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}
So, the idea here is that from several values, you hash them using this function:
- Adding this function and call
hashinstead ofstd::hash<T>. - Adding an overload to
hashfunction taking several arguments and callshash_combine.
template <class T>
void hash_combine(std::size_t& seed, const T& v)
{
seed ^= hash(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}
template<hashable ...Ts>
std::size_t hash(const Ts& ...objects)
{
std::size_t result = 0;
(hash_combine(result, objects), ...);
return result;
}
// user-side
struct Person {
bool operator==(const Person&) const = default;
std::size_t hash() const noexcept { return ::hash(name, age); }
std::string name;
int age;
};
We can further optimize by initializing the result to the hash of the first object.
Simplify hashing in C++ of a range of values
Now, we want to be able to hash a range of values, like std::vector<Person>.
- We need to know if a type is an iterable
- We need to add a specialization of
std::hashfor range values
template<typename T>
concept is_iterable = requires(const T& object) {
object.begin();
object.end();
};
template<is_iterable Iterable>
struct std::hash<Iterable> {
std::size_t operator()(const Iterable& range) const noexcept {
std::size_t result = 0;
for(const auto& object: range)
hash_combine(result, object);
return result;
}
};
Adding specialization for tuple and pair
Again, here is the list of what we need to do:
- Add a pair specialization for
std::hashcallinghash(p.first, p.second) - Add a tuple specialization for
std::hashcalling hash on each object.
template<typename T1, typename T2>
struct std::hash<std::pair<T1, T2>> {
std::size_t operator()(const std::pair<T1, T2>& pair) const noexcept {
return ::hash(pair.first, pair.second);
}
};
template<typename ...Ts>
struct std::hash<std::tuple<Ts...>> {
std::size_t operator()(const std::tuple<Ts...>& tuple) const noexcept {
auto to_apply = [](const auto& ...objects) { return ::hash(objects...); };
return std::apply(to_apply, tuple);
}
};
Conclusion
I hope you liked the article and learnt something. Here is the complete code of our library 😀 .
// Library side
template<typename T>
concept is_std_hashable = requires(const T& object) {
{std::hash<T>{}(object)} -> std::convertible_to<std::size_t>;
};
template<typename T>
concept has_hash_member_function = requires(const T& object) {
{object.hash()} -> std::convertible_to<std::size_t>;
};
template<typename T>
concept hashable = is_std_hashable<T> || has_hash_member_function<T>;
template<typename T>
concept is_iterable = requires(const T& object) {
object.begin();
object.end();
};
template<hashable T>
std::size_t hash(const T& object) {
if constexpr(is_std_hashable<T>)
return std::hash<T>{}(object);
else
return object.hash();
}
template <class T>
void hash_combine(std::size_t& seed, const T& v)
{
seed ^= hash(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}
template<hashable ...Ts>
std::size_t hash(const Ts& ...objects)
{
std::size_t result = 0;
(hash_combine(result, objects), ...);
return result;
}
template<has_hash_member_function T>
struct std::hash<T>{
std::size_t operator()(const T& object) const noexcept {
return object.hash();
}
};
template<is_iterable Iterable>
struct std::hash<Iterable> {
std::size_t operator()(const Iterable& range) const noexcept {
std::size_t result = 0;
for(const auto& object: range)
hash_combine(result, object);
return result;
}
};
template<typename T1, typename T2>
struct std::hash<std::pair<T1, T2>> {
std::size_t operator()(const std::pair<T1, T2>& pair) const noexcept {
return ::hash(pair.first, pair.second);
}
};
template<typename ...Ts>
struct std::hash<std::tuple<Ts...>> {
std::size_t operator()(const std::tuple<Ts...>& tuple) const noexcept {
auto to_apply = [](const auto& ...objects) { return ::hash(objects...); };
return std::apply(to_apply, tuple);
}
};
If you want to see it in action: Godbolt link
Leave a Reply